Recent content by Predictor

Data mining - How create a decision threshold

Different thresholds will offer different trade-offs between errors and throughput: at one end, errors are less frequent but sometimes no outcome is chosen; at the other end, some outcome is always chosen, but errors are more frequent. You will need to decide what trade-off best suits your problem.
- Predictor
- Post #2
- Dec 18, 2011
- Forum: Data Warehousing/Big Data
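
The trade-off described above can be illustrated with a small sketch. It assumes a binary classifier that outputs a probability for the positive class; the function names, the symmetric threshold rule, and the sample data are all hypothetical and are not taken from the original thread. A decision is made only when the probability is far enough from 0.5, otherwise no outcome is chosen.

```python
from typing import List, Optional, Sequence, Tuple


def decide(prob_positive: float, threshold: float) -> Optional[int]:
    """Return 1 or 0 when the score is confident enough, else None.

    `threshold` is a hypothetical confidence cutoff between 0.5 and 1.0.
    """
    if prob_positive >= threshold:
        return 1
    if prob_positive <= 1.0 - threshold:
        return 0
    return None  # abstain: no outcome is chosen


def evaluate(probs: Sequence[float], labels: Sequence[int],
             threshold: float) -> Tuple[float, float]:
    """Return (coverage, error_rate) for one threshold.

    coverage   = fraction of cases where some outcome was chosen
    error_rate = fraction of the chosen outcomes that were wrong
    """
    decided: List[Tuple[int, int]] = []
    for p, y in zip(probs, labels):
        d = decide(p, threshold)
        if d is not None:
            decided.append((d, y))
    coverage = len(decided) / len(labels)
    errors = sum(1 for d, y in decided if d != y)
    error_rate = errors / len(decided) if decided else 0.0
    return coverage, error_rate


# Made-up scores and true labels, for illustration only.
probs = [0.95, 0.80, 0.60, 0.55, 0.40, 0.30, 0.10, 0.05]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

for t in (0.5, 0.75):
    coverage, error_rate = evaluate(probs, labels, t)
    print(f"threshold={t}: coverage={coverage:.2f}, error rate={error_rate:.2f}")
```

On this toy data, the lower threshold always chooses an outcome but makes some mistakes, while the higher threshold makes fewer mistakes but leaves half the cases undecided. Which point on that curve is acceptable depends on the cost of an error versus the cost of no decision.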
train and test set are not compatible - Weka

Have you checked the documentation for the meaning of this error, or consulted the Weka Web page (http://www.cs.waikato.ac.nz/ml/weka/)?
- Predictor
- Post #2
- Nov 16, 2010
- Forum: Data Warehousing/Big Data
stronger correlation formula

If I have two series of numbers, series A contains either 1s or 0s, depending on whether a patient took a pill or not. Series B contains random numbers. All of the series B numbers that coincide with the patient taking a pill have an average of 100, whereas those that coincide with NOT taking a pill...
- Predictor
- Post #7
- Dec 11, 2009
- Forum: Data Warehousing/Big Data
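
The excerpt above is cut off, so the following is only a sketch of one common way to measure association between a 0/1 series and a numeric series: the ordinary Pearson correlation, which in this binary case is also known as the point-biserial correlation. It is not necessarily what the thread settled on. The data are invented: the pill group is built to average 100 as in the post, while the value for the no-pill group is not visible in the excerpt, so 90 is an arbitrary placeholder.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Series A: 1 = took the pill, 0 = did not (hypothetical data).
took_pill = [1, 1, 1, 0, 0, 0]
# Series B: pill group averages 100; the no-pill average of 90 is a placeholder.
measurement = [102, 98, 100, 91, 89, 90]

r = pearson(took_pill, measurement)
print(f"point-biserial correlation: {r:.3f}")
```

The coefficient grows with the gap between the two group means and shrinks as the spread within each group grows, so it summarizes how cleanly series B separates by pill status.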
Data Mining Methodologies

Methodologies are largely checklists to help avoid overlooking anything. I don't think that one really presents a substantial advantage over the next. Personally, I use my own process, which I hone over time.
- Predictor
- Post #2
- Jul 21, 2009
- Forum: Data Warehousing/Big Data
Open Source Data Mining Tools

I wonder what experiences people here have had with open source data mining tools (Weka, Yale / RapidMiner, Orange, etc.)?
- Predictor
- Thread
- Feb 10, 2009
- Replies: 1
- Forum: Data Warehousing/Big Data
Collection Industry

In most cases, the most predictive data regarding delinquent customers' likelihood to pay will be their activity with the loan product (purchasing and payment activity on a credit card, etc.). Credit bureau data is also popular, and I know some people have had success with demographic data...
- Predictor
- Post #4
- Jan 3, 2008
- Forum: Data Warehousing/Big Data
Starting Data Mining

Try the FAQ section here for the item titled "Where can I find more information on data mining?" -Will Dwinnell http://matlabdatamining.blogspot.com/
- Predictor
- Post #3
- Jan 3, 2008
- Forum: Data Warehousing/Big Data
Collection Industry

I work for the collections department of a bank, building predictive models of customer behavior. Our data is stored in Oracle, which I retrieve to a PC for analysis and model development in MATLAB. Did you have more specific questions?
- Predictor
- Post #2
- Oct 29, 2007
- Forum: Data Warehousing/Big Data
Statistics vs. Data Mining

Linked below is another paper on the subject of data mining versus statistics, "Data Mining and Statistics: What's the Connection?", by Friedman: http://www-stat.stanford.edu/~jhf/ftp/dm-stat.pdf -Will
- Predictor
- Post #3
- Oct 26, 2007
- Forum: Data Warehousing/Big Data
searching data mining tool that detects repeating patterns

I suppose association rule analysis (also called "market basket analysis") might work. You can find a list of commercial and free tools which perform such analysis at: http://www.kdnuggets.com/software/associations.html If, however, you know how the groups will be defined (model, color, A/C...
- Predictor
- Post #2
- Sep 8, 2007
- Forum: Data Warehousing/Big Data
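
As a rough illustration of what association rule analysis computes, the sketch below counts how often pairs of items occur together and reports support and confidence for each pair. It is a simplified, pairs-only version written for this example, not the algorithm of any tool on the linked list. The function name, the `min_support` parameter, and the vehicle records are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Set, Tuple

Rule = Tuple[str, str, float, float]  # (if_item, then_item, support, confidence)


def pair_rules(transactions: Iterable[Set[str]],
               min_support: float = 0.5) -> List[Rule]:
    """Find 'if A then B' rules between single items.

    support    = share of transactions containing both A and B
    confidence = share of transactions containing A that also contain B
    """
    transactions = list(transactions)
    n = len(transactions)
    item_counts: Counter = Counter()
    pair_counts: Counter = Counter()
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules: List[Rule] = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair is too rare to report
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return rules


# Hypothetical records: each set lists the attributes of one vehicle.
transactions = [
    {"sedan", "red", "A/C"},
    {"sedan", "red"},
    {"sedan", "blue", "A/C"},
    {"coupe", "red", "A/C"},
]

rules = pair_rules(transactions, min_support=0.5)
for if_item, then_item, support, confidence in rules:
    print(f"{if_item} -> {then_item}: support={support:.2f}, confidence={confidence:.2f}")
```

Real tools extend the same counting idea to larger item sets and prune the search so it stays tractable on large data.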
what to stratify by in data partition

In general, for train/test splitting, I try to stratify as much as possible within reason, and yes, I do stratify on the dependent variable. "Within reason" means: 1. I worry most about variables believed to be important, and 2. individual stratification cells should not become too small. You...
- Predictor
- Post #4
- Sep 8, 2007
- Forum: Data Warehousing/Big Data
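
A minimal sketch of a stratified train/test split along these lines is shown below. It assumes each record is a dictionary and that the caller supplies a function returning the stratification cell (for example, the value of the dependent variable). The function name, the `min_cell_size` guard, and the sample records are hypothetical; the guard is one simple way to act on the concern that cells should not become too small.

```python
import random
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Tuple

Row = Dict[str, int]


def stratified_split(rows: List[Row],
                     key: Callable[[Row], Hashable],
                     test_fraction: float = 0.2,
                     min_cell_size: int = 5,
                     seed: int = 0) -> Tuple[List[Row], List[Row]]:
    """Split rows into (train, test), sampling separately within each cell."""
    rng = random.Random(seed)
    cells: Dict[Hashable, List[Row]] = defaultdict(list)
    for row in rows:
        cells[key(row)].append(row)

    train: List[Row] = []
    test: List[Row] = []
    for cell, cell_rows in cells.items():
        if len(cell_rows) < min_cell_size:
            # Too few rows to split sensibly: stratify on fewer variables.
            raise ValueError(f"stratification cell {cell!r} is too small")
        shuffled = cell_rows[:]
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return train, test


# Hypothetical data: 40 records, 10 of which have the outcome paid = 1.
rows = [{"id": i, "paid": int(i % 4 == 0)} for i in range(40)]

train, test = stratified_split(rows, key=lambda r: r["paid"], test_fraction=0.2)
print(len(train), sum(r["paid"] for r in train))
print(len(test), sum(r["paid"] for r in test))
```

To stratify on several variables at once, the `key` function can return a tuple of their values, at the cost of smaller cells.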
Interpolation of Data

Yes, try the 'pchip' or 'spline' functions in MATLAB.
- Predictor
- Post #3
- Jun 14, 2007
- Forum: Fortran
what to stratify by in data partition

That depends on what one is trying to accomplish. Why are you stratifying the data?
- Predictor
- Post #2
- Jun 14, 2007
- Forum: Data Warehousing/Big Data
Family Recipe For Neural Networks

Readers here may be interested in my article, Family Recipe For Neural Networks, which was posted to the Data Mining and Predictive Analytics Web log: http://abbottanalytics.blogspot.com/2006/11/family-recipe-for-neural-networks.html#links I hope this is helpful!
- Predictor
- Thread
- Apr 29, 2007
- Replies: 0
- Forum: Data Warehousing/Big Data
Free And Inexpensive Data Mining Software

There is a post on the Data Mining and Predictive Analytics Web log, Free And Inexpensive Data Mining Software, which may be of interest: http://abbottanalytics.blogspot.com/2006/11/free-and-inexpensive-data-mining.html Note the discussion which follows in the Comments section, as well.
- Predictor
- Thread
- Apr 25, 2007
- Replies: 0
- Forum: Data Warehousing/Big Data
