<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>eric.ness.net &#187; Statistics</title>
	<atom:link href="http://eric.ness.net/archives/category/statistics/feed/" rel="self" type="application/rss+xml" />
	<link>http://eric.ness.net</link>
	<description>...I never learned to read.</description>
	<lastBuildDate>Sat, 21 Jan 2012 05:27:48 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Monte Carlo Simulations in C#</title>
		<link>http://eric.ness.net/archives/monte-carlo-simulations-in-c/</link>
		<comments>http://eric.ness.net/archives/monte-carlo-simulations-in-c/#comments</comments>
		<pubDate>Mon, 23 May 2011 20:04:58 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://eric.ness.net/?p=493</guid>
		<description><![CDATA[Monte Carlo Simulations in C#]]></description>
			<content:encoded><![CDATA[<p><img class="alignnone size-full wp-image-603" title="mcs_in_csharp1" src="http://eric.ness.net/wp-content/uploads/2011/05/mcs_in_csharp1.jpg" alt="" width="577" height="360" /></p>
<p>Let me say I am a huge fan of <a href="http://en.wikipedia.org/wiki/Monte_Carlo_method">Monte Carlo Simulations</a>. For those of you who are not familiar with <a href="http://www.vertex42.com/ExcelArticles/mc/MonteCarloSimulation.html">Monte Carlo Simulations</a>: they use random numbers and a model to simulate an outcome or event, but their real strength is that you repeat the simulation hundreds if not thousands of times. Taken as a whole, the results give you great insight into the range of outcomes you can expect and how likely each one is. That is often more informative than the single estimate you get from other methods like regressions.</p>
<p>For this tutorial, I am going to replicate the core functionality of this <a href="http://www.lumenaut.com/images/montecarlo/monte_carlo_results1.htm">report</a> from <a href="http://www.lumenaut.com/montecarlo.htm">Lumenaut</a>. The model has a few fixed items: Labor, Rent and Price per Widget. It also has two variable items: Variable Cost per Widget and Number of Widgets Sold. So, just like in real life, if I want to figure out how much profit I am going to make, I need to subtract my total costs from my revenue.</p>
<p><a href="http://eric.ness.net/wp-content/uploads/2011/05/lumenaut-monte-model.png"><img class="aligncenter size-full wp-image-570" title="lumenaut monte model" src="http://eric.ness.net/wp-content/uploads/2011/05/lumenaut-monte-model.png" alt="" width="264" height="323" /></a></p>
<p>The strength of a Monte Carlo Simulation is that it lets us look at this model again and ask: what if I can buy my widgets a little cheaper, or sell a couple more? In addition, every month is a little different: some months you sell a few more, some a few less.</p>
<p>To simulate this we need some random numbers, but an ordinary random number generator will not work on its own because it gives us a uniform distribution. The random numbers for a Monte Carlo Simulation need to follow specific distributions (normal, log-normal, triangular, gamma, or something else) to better simulate real life. The <a href="http://reactnet.sourceforge.net/">React.NET</a> library gives us an easy way to draw from distributions like these.</p>
<p>So for Lumenaut’s model we need two normally distributed random number generators and the profit model outlined in the Excel table above and the following graph. Please note that the <strong>Cost Per Widget</strong> has a mean of 5 with a standard deviation of 0.5, and the <strong>Number of Widgets Sold</strong> has a mean of 2,000 with a standard deviation of 200.</p>
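<p>If you are curious what such a generator does under the hood, here is a minimal sketch of one common technique, the Box-Muller transform, which turns two uniform random numbers into one normally distributed number. I have not checked how React.NET implements its <code>Normal</code> class, so treat this as an illustration only; the <code>NormalSampler</code> class, its constructor and the seed parameter are names I made up for this sketch.</p>
<pre class="brush: jscript; title: ; notranslate">
using System;

namespace MonteCarloSimulation.Sketches
{
    // Hypothetical helper (not part of React.NET): draws normally
    // distributed values from System.Random via the Box-Muller transform.
    public class NormalSampler
    {
        private readonly Random _random;
        private readonly double _mean;
        private readonly double _standardDeviation;

        public NormalSampler(double mean, double standardDeviation, int seed)
        {
            _mean = mean;
            _standardDeviation = standardDeviation;
            _random = new Random(seed);
        }

        public double NextDouble()
        {
            // Random.NextDouble() returns a value in [0, 1); using 1 - u keeps
            // the argument of Math.Log in (0, 1] so it is never zero.
            double u1 = 1.0 - _random.NextDouble();
            double u2 = _random.NextDouble();

            // Box-Muller: two independent uniforms give one standard normal.
            double standardNormal = Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);

            // Shift and scale to the requested mean and standard deviation.
            return _mean + _standardDeviation * standardNormal;
        }
    }
}
</pre>
<p>With a mean of 5 and a standard deviation of 0.5, roughly 95% of the values this returns should fall between 4 and 6.</p>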
<p><a href="http://eric.ness.net/wp-content/uploads/2011/05/sim-results.png"><img class="aligncenter size-full wp-image-574" title="sim results" src="http://eric.ness.net/wp-content/uploads/2011/05/sim-results.png" alt="" width="577" height="316" /></a></p>
<p>Here is the code I use.</p>
<pre class="brush: jscript; title: ; notranslate">
using System.Collections.Generic;
using React.Distribution;

namespace MonteCarloSimulation.Models
{
    /// &lt;summary&gt;
    /// Monte Carlo Simulation
    /// &lt;/summary&gt;
    public class MonteCarloModel
    {
        private const double FixedCost = 170000;
        private const double FixedSellingPrice = 100;
        private readonly double _costPerWidget;
        private readonly double _costPerWidgetSd;
        private readonly double _numOfSimulations;
        private readonly double _numOfWidgetSd;
        private readonly double _numOfWidgets;
        public int NumberOfSimulations = 10000;
        public List&lt;double&gt; RESULTS = new List&lt;double&gt;();

        /// &lt;summary&gt;
        /// Initializes a new instance of the &lt;see cref=&quot;MonteCarloModel&quot;/&gt; class.
        /// &lt;/summary&gt;
        /// &lt;param name=&quot;costPerWidget&quot;&gt;The cost per widget.&lt;/param&gt;
        /// &lt;param name=&quot;costPerWidgetSd&quot;&gt;The cost per widget sd.&lt;/param&gt;
        /// &lt;param name=&quot;numOfWidgets&quot;&gt;The num of widgets.&lt;/param&gt;
        /// &lt;param name=&quot;numOfWidgetSd&quot;&gt;The num of widget sd.&lt;/param&gt;
        /// &lt;param name=&quot;numOfSimulations&quot;&gt;The num of simulations.&lt;/param&gt;
        public MonteCarloModel(double costPerWidget,
                               double costPerWidgetSd,
                               double numOfWidgets,
                               double numOfWidgetSd,
                               double numOfSimulations)
        {
            _costPerWidget = costPerWidget;
            _costPerWidgetSd = costPerWidgetSd;
            _numOfWidgets = numOfWidgets;
            _numOfWidgetSd = numOfWidgetSd;
            _numOfSimulations = numOfSimulations;
            Run();
        }

        /// &lt;summary&gt;
        /// Runs the Monte Carlo Simulation
        /// &lt;/summary&gt;
        private void Run()
        {
            // Set up our Normal distributions with the mean, and the Standard Deviation
            var costPerWidgetDist = new Normal(_costPerWidget, _costPerWidgetSd);
            var numberOfWidgetDist = new Normal(_numOfWidgets, _numOfWidgetSd);

            for (int i = 0; i &lt; _numOfSimulations; i++)
            {
                // Get the next random numbers for our model
                double costPerWidget = costPerWidgetDist.NextDouble();
                double numberOfWidgetsSold = numberOfWidgetDist.NextDouble();

                // Calculate the revenue
                double revenue = numberOfWidgetsSold*FixedSellingPrice;

                // Calculate the costs
                double cost = (costPerWidget*numberOfWidgetsSold + FixedCost);

                // Add result to our results list
                RESULTS.Add(revenue - cost);
            }
        }
    }
}
</pre>
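<p>Once the list of results is filled, a few summary numbers tell you most of what a histogram would show. Here is a minimal sketch, assuming you pass in the <code>RESULTS</code> list from the model above; the <code>ResultSummary</code> class and its method names are hypothetical and not part of the download.</p>
<pre class="brush: jscript; title: ; notranslate">
using System;
using System.Collections.Generic;
using System.Linq;

namespace MonteCarloSimulation.Sketches
{
    // Hypothetical helper, not part of the downloadable project.
    public static class ResultSummary
    {
        // Average simulated profit (throws if the list is empty).
        public static double Mean(List&lt;double&gt; results)
        {
            return results.Average();
        }

        // Sample standard deviation of the simulated profits.
        public static double StandardDeviation(List&lt;double&gt; results)
        {
            if (results.Count &lt; 2) return 0;

            double mean = results.Average();
            double sumOfSquares = results.Sum(r =&gt; (r - mean) * (r - mean));
            return Math.Sqrt(sumOfSquares / (results.Count - 1));
        }

        // Share of simulations that ended with a loss (profit below zero).
        public static double ProbabilityOfLoss(List&lt;double&gt; results)
        {
            if (results.Count == 0) return 0;

            return results.Count(r =&gt; r &lt; 0) / (double) results.Count;
        }
    }
}
</pre>
<p>With the inputs from Lumenaut’s model, the mean should land near 20,000, since 2,000 * 100 - (5 * 2,000 + 170,000) = 20,000. It will not hit that number exactly because of the random component.</p>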
<p>Here is a little display I put together. As you can see, the results are very similar to Lumenaut’s report; they will not be exactly the same because there is a random component to all of this.</p>
<p><a href="http://eric.ness.net/wp-content/uploads/2011/05/1983634598.png"><img class="aligncenter size-full wp-image-577" title="1983634598" src="http://eric.ness.net/wp-content/uploads/2011/05/1983634598.png" alt="" width="577" height="360" /></a></p>
<p><a href="http://eric.ness.net/wp-content/uploads/2011/05/MonteCarloSimulation.zip">Download the code!</a></p>
<p><strong>Note:</strong> There are a few things to keep in mind:</p>
<ol>
<li>This is nowhere near production code (read as good code) – it is simply for you to look at and get you going down the road to Monte Carlo Simulations. I think I spent all of 30 minutes putting it together so don’t expect anything nice and neat. :-)</li>
<li>I had to remove the Dundas graphing library, as I cannot distribute it under my license, but the code should work just fine with the <a href="http://www.microsoft.com/downloads/en/details.aspx?FamilyID=130f7986-bf49-4fe5-9ca8-910ae6ea442c&amp;DisplayLang=en">Microsoft graphing library</a>.</li>
<li>Enjoy!</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://eric.ness.net/archives/monte-carlo-simulations-in-c/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Scatterplots Using R and MSSQL</title>
		<link>http://eric.ness.net/archives/scatterplots-using-r-and-mssql/</link>
		<comments>http://eric.ness.net/archives/scatterplots-using-r-and-mssql/#comments</comments>
		<pubDate>Fri, 27 Nov 2009 15:00:57 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[R]]></category>

		<guid isPermaLink="false">http://eric.ness.net/?p=412</guid>
		<description><![CDATA[Scatterplots Using R and MSSQL]]></description>
			<content:encoded><![CDATA[<p><a href="http://eric.ness.net/wp-content/uploads/2009/11/scatterplotwithr.jpg"><img class="alignnone size-full wp-image-416" title="scatterplotwithr" src="http://eric.ness.net/wp-content/uploads/2009/11/scatterplotwithr.jpg" alt="" width="577" height="360" /></a></p>
<p>As an extension of <a href="http://eric.ness.net/archives/histogram-lattices-using-r-and-mssql/">yesterday&#8217;s post</a>, here is another fairly cool chart you can do in <a href="http://www.r-project.org/">R</a>. For this little sample we are using the same data as before, but this time the SQL query has to be a crosstab (pivot) query so that each indicator ends up in its own column. So let&#8217;s take a look at the code:</p>
<pre class="brush: jscript; title: ; notranslate">
# includes
library(RODBC)

# create connection
channel &lt;- odbcConnect(&quot;HealthDB&quot;)

# query database
myData &lt;- sqlQuery(channel, &quot;SELECT
Country AS 'Country',
Year AS 'Year',
[96741] AS 'GDP growth (annual %)--WDI-2009',
[96841] AS 'GDP per capita (constant 2000 US$)--WDI-2009',
[99941] AS 'Population growth (annual %)--WDI-2009',
[100041] AS 'Population, total--WDI-2009'
FROM
(
SELECT DISTINCT CountryID, Country, Year, IndicatorID, IndValue
FROM [Time Series Data]
WHERE (
((IndicatorID) = 96741) OR
((IndicatorID) = 96841) OR
((IndicatorID) = 99941) OR
((IndicatorID) = 100041))
AND
(((CountryID) = 4118) OR
((CountryID) = 4125) OR
((CountryID) = 4129) OR
((CountryID) = 4134) OR
((CountryID) = 4141) OR
((CountryID) = 4145) OR
((CountryID) = 4164) OR
((CountryID) = 4186) OR
((CountryID) = 4213) OR
((CountryID) = 4327) OR
((CountryID) = 4219) OR
((CountryID) = 4221) OR
((CountryID) = 4227) OR
((CountryID) = 4230) OR
((CountryID) = 4243) OR
((CountryID) = 4326) OR
((CountryID) = 4268) OR
((CountryID) = 4272) OR
((CountryID) = 4273) OR
((CountryID) = 4325) OR
((CountryID) = 4300) OR
((CountryID) = 4308) OR
((CountryID) = 4309) OR
((CountryID) = 4311) OR
((CountryID) = 4316))
AND
(NOT (Year IS NULL)) AND (Year &gt;= 1960) AND
(Year &lt;= 2007))
ps
PIVOT (
MAX(IndValue)
FOR IndicatorID IN ([96741], [96841], [99941], [100041] ) )
AS
pvt
order by Country, Year&quot;)

#close connection
odbcClose(channel)

#Plot charts
plot(myData[3:6], col=&quot;orange&quot;, main=&quot;Select Indicators for Europe and Central Asia&quot;)
</pre>
<p>Here is the result:</p>
<p><a href="http://eric.ness.net/wp-content/uploads/2009/11/Scatterplot.jpg"><img class="alignnone size-full wp-image-414" title="Scatterplot" src="http://eric.ness.net/wp-content/uploads/2009/11/Scatterplot.jpg" alt="Scatterplot" width="577" height="400" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://eric.ness.net/archives/scatterplots-using-r-and-mssql/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Histogram Lattices Using R and MSSQL</title>
		<link>http://eric.ness.net/archives/histogram-lattices-using-r-and-mssql/</link>
		<comments>http://eric.ness.net/archives/histogram-lattices-using-r-and-mssql/#comments</comments>
		<pubDate>Thu, 26 Nov 2009 19:22:41 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[R]]></category>

		<guid isPermaLink="false">http://eric.ness.net/?p=402</guid>
		<description><![CDATA[Creating Histogram Lattices Using R and MSSQL]]></description>
			<content:encoded><![CDATA[<p><a href="http://eric.ness.net/wp-content/uploads/2009/11/histogramlattice.jpg"><img class="alignnone size-full wp-image-405" title="histogramlattice" src="http://eric.ness.net/wp-content/uploads/2009/11/histogramlattice.jpg" alt="" width="577" height="360" /></a></p>
<p>After getting Joseph Adler&#8217;s book &#8220;<a href="http://oreilly.com/catalog/9780596009427">Baseball Hacks</a>&#8221; I&#8217;ve been wanting to get into <a href="http://www.r-project.org/">R</a>. R is simply an amazing open source statistics/graphing application. For this example we are going to pull data from an MSSQL database and make a histogram lattice for a handful of countries.</p>
<p>First, I pulled the data from the <a href="http://healthsystems2020.healthsystemsdatabase.org/datasets/timeseriesdataset.aspx">HealthSystems2020</a> time series database and imported it into MSSQL. I did some minor touch-ups to the database, such as giving each indicator an id. The second thing you need to do is create an ODBC connection for your database; here is a fairly good <a href="http://www.devasp.com/samples/dsn_sql.asp">tutorial</a>. In this example I called my ODBC DSN &#8220;HealthDB&#8221;. Also make sure you adjust your SQL query so that it pulls the correct names/values.</p>
<p>Finally, here is the code:</p>
<pre class="brush: jscript; title: ; notranslate">

# includes
library(RODBC)
library(lattice)

# create connection
channel &lt;- odbcConnect(&quot;HealthDB&quot;)

# query database
myData &lt;- sqlQuery(channel, &quot;SELECT Country, IndValue
FROM         [YOURTABLE]
WHERE     (id = 96841) AND (
(Country = 'Afghanistan') OR
(Country = 'Bangladesh') OR
(Country = 'Bhutan') OR
(Country = 'India') OR
(Country = 'Maldives') OR
(Country = 'Nepal') OR
(Country = 'Pakistan') OR
(Country = 'China') OR
(Country = 'Indonesia') OR
(Country = 'Sri Lanka'))&quot;)

#close connection
odbcClose(channel)

#create histogram (column 1 is Country, column 2 is IndValue)
histogram(~ myData[,2] | myData[,1], type=&quot;count&quot;, col=&quot;red&quot;, main = &quot;GDP per capita (constant 2000 US$)&quot;, xlab=&quot;Country&quot;)
</pre>
<p>And here is the result:<br />
<a href="http://eric.ness.net/wp-content/uploads/2009/11/histogram.jpg"><img class="alignnone size-full wp-image-403" title="histogram" src="http://eric.ness.net/wp-content/uploads/2009/11/histogram.jpg" alt="histogram" width="577" height="376" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://eric.ness.net/archives/histogram-lattices-using-r-and-mssql/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Pearson&#8217;s Correlation Coefficient</title>
		<link>http://eric.ness.net/archives/pearsons-correlation-coefficient/</link>
		<comments>http://eric.ness.net/archives/pearsons-correlation-coefficient/#comments</comments>
		<pubDate>Sun, 25 Oct 2009 19:33:25 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://eric.ness.net/?p=272</guid>
		<description><![CDATA[Pearson's Correlation Coefficient walk through]]></description>
			<content:encoded><![CDATA[<p><a href="http://eric.ness.net/wp-content/uploads/2009/10/Pearsons_Correlation_Coeffi.jpg"><img class="alignnone size-full wp-image-282" title="Pearsons_Correlation_Coeffi" src="http://eric.ness.net/wp-content/uploads/2009/10/Pearsons_Correlation_Coeffi.jpg" alt="" width="577" height="360" /></a></p>
<p>In Toby Segaran&#8217;s book &#8220;Programming Collective Intelligence&#8221;, another method used to determine the similarity between people&#8217;s interests is Pearson&#8217;s correlation coefficient. In statistics, Pearson&#8217;s correlation coefficient is often written simply as r. I also covered Toby&#8217;s Euclidean Distance Score <a title="http://eric.ness.net/archives/euclidean-distance-score/" href="http://eric.ness.net/archives/euclidean-distance-score/">here</a>.</p>
<p><img src="http://eric.ness.net/wp-content/uploads/2009/10/hl_correl_frm_r.png" alt="hl_correl_frm_r" width="197" height="102" /></p>
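<p>That is how r is calculated: the sum of the products of each pair&#8217;s deviations from the means, divided by the square root of the product of the two sums of squared deviations.</p>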
<p>And here is some sloppy source code to get you going:</p>
<pre class="brush: jscript; title: ; notranslate">
using System;
using System.Linq;

namespace PearsonTest
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            var myP = new Correlation();

            var lisaRose = new double[] {0, 2, 4, 6, 8, 10, 12};
            var jackMatthews = new[] {2.1, 5, 9, 12.6, 17.3, 21, 24.7};

            double score = myP.PearsonCorrelation(lisaRose, jackMatthews);

            Console.WriteLine(score);
            Console.ReadLine();

            // The answer is 0.99887956534852
        }
    }

    internal class Correlation
    {
        public double PearsonCorrelation(double[] x, double[] y)
        {
            double result;
            double xMean = 0;
            double yMean = 0;
            double xDenom = 0;
            double yDenom = 0;
            double denominator;
            double numerator = 0;
            double n;

            // Make sure the arrays are the same size and not empty
            if ((x.Count() == y.Count()) &amp;&amp; (x.Count() &gt;= 1))
            {
                n = x.Count();
            }
            else
            {
                result = 0;
                return result;
            }

            // Find Means
            for (int i = 0; i &lt;= n - 1; i++)
            {
                xMean += x[i];
                yMean += y[i];
            }
            xMean = xMean/n;
            yMean = yMean/n;

            // Calculate numerator and denominator
            for (int i = 0; i &lt;= n - 1; i++)
            {
                // Calculate numerator
                double numX = x[i] - xMean;
                double numY = y[i] - yMean;
                numerator += numX*numY;

                // Calculate denominator parts
                xDenom += Math.Pow(numX, 2);
                yDenom += Math.Pow(numY, 2);
            }

            // Calculate denominator
            denominator = Math.Sqrt(xDenom*yDenom);

            // Check for division by zero
            if (denominator == 0)
            {
                result = 0;
            }
            else
            {
                result = numerator/denominator;
            }

            return result;
        }
    }
}
</pre>
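<p>A quick way to sanity-check this (or any) implementation of r is to feed it data where you already know the answer: a perfectly rising line should score 1, a perfectly falling line -1, and a series with no variation falls into the division-by-zero case and scores 0. The sketch below is a compact restatement of the same formula, not the code from the book; the <code>PearsonSanityCheck</code> class and the sample arrays are made up for illustration. It assumes .NET 4 or later for <code>Enumerable.Zip</code> and should be compiled on its own, since it has its own <code>Main</code>.</p>
<pre class="brush: jscript; title: ; notranslate">
using System;
using System.Linq;

namespace PearsonSketch
{
    // Hypothetical stand-alone check, separate from the Correlation class above.
    internal static class PearsonSanityCheck
    {
        // Same formula as above, written with LINQ.
        private static double Pearson(double[] x, double[] y)
        {
            double xMean = x.Average();
            double yMean = y.Average();

            // Sum of the products of the paired deviations from the means
            double numerator = x.Zip(y, (a, b) =&gt; (a - xMean) * (b - yMean)).Sum();

            // Sums of squared deviations
            double xDenom = x.Sum(a =&gt; (a - xMean) * (a - xMean));
            double yDenom = y.Sum(b =&gt; (b - yMean) * (b - yMean));

            double denominator = Math.Sqrt(xDenom * yDenom);
            return denominator == 0 ? 0 : numerator / denominator;
        }

        private static void Main()
        {
            var x = new double[] {1, 2, 3, 4, 5};
            var rising = new double[] {10, 20, 30, 40, 50};   // y = 10x
            var falling = new double[] {50, 40, 30, 20, 10};  // y = 60 - 10x
            var flat = new double[] {7, 7, 7, 7, 7};          // no variation

            Console.WriteLine(Pearson(x, rising));   // prints 1
            Console.WriteLine(Pearson(x, falling));  // prints -1
            Console.WriteLine(Pearson(x, flat));     // prints 0
        }
    }
}
</pre>
<p>Keep in mind that a score of 0 in the flat case only means r is undefined there, not that the two series were measured to be unrelated.</p>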
]]></content:encoded>
			<wfw:commentRss>http://eric.ness.net/archives/pearsons-correlation-coefficient/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>The Elements of Statistical Learning: Data Mining, Inference, and Prediction. [Free Book]</title>
		<link>http://eric.ness.net/archives/the-elements-of-statistical-learning-data-mining-inference-and-prediction-free-book/</link>
		<comments>http://eric.ness.net/archives/the-elements-of-statistical-learning-data-mining-inference-and-prediction-free-book/#comments</comments>
		<pubDate>Thu, 15 Oct 2009 17:41:21 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://eric.ness.net/?p=233</guid>
		<description><![CDATA[Free e-book - The Elements of Statistical Learning: Data Mining, Inference, and Prediction]]></description>
			<content:encoded><![CDATA[<p><a href="http://eric.ness.net/wp-content/uploads/2009/10/elements_of_stat_learning.jpg"><img class="alignnone size-full wp-image-234" title="elements_of_stat_learning" src="http://eric.ness.net/wp-content/uploads/2009/10/elements_of_stat_learning.jpg" alt="" width="577" height="360" /></a></p>
<p>I came across this on Our Signal today. It&#8217;s a free e-book: <strong>The Elements of Statistical Learning: Data Mining, Inference, and Prediction</strong> written by Trevor Hastie, Robert Tibshirani and Jerome Friedman. The book and accompanying site are located <a href="http://www-stat.stanford.edu/~tibs/ElemStatLearn//">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://eric.ness.net/archives/the-elements-of-statistical-learning-data-mining-inference-and-prediction-free-book/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Nate Silver &amp; Conn Carroll Video.</title>
		<link>http://eric.ness.net/archives/nate-silver-conn-carroll-video/</link>
		<comments>http://eric.ness.net/archives/nate-silver-conn-carroll-video/#comments</comments>
		<pubDate>Tue, 16 Jun 2009 17:59:42 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Video/Audio]]></category>
		<category><![CDATA[Video]]></category>

		<guid isPermaLink="false">http://eric.ness.net/?p=221</guid>
		<description><![CDATA[Nate Silver &#038; Conn Carroll interview]]></description>
			<content:encoded><![CDATA[<p><a href="http://eric.ness.net/wp-content/uploads/2009/06/NateSilver_ConnCarroll.jpg"><img class="alignnone size-full wp-image-222" title="NateSilver_ConnCarroll" src="http://eric.ness.net/wp-content/uploads/2009/06/NateSilver_ConnCarroll.jpg" alt="" width="577" height="360" /></a></p>
<p>I know this is a bit late, but it gives some pretty good insight into Nate Silver of <a title="fivethirtyeight.com" href="http://www.fivethirtyeight.com" target="_blank">fivethirtyeight.com</a> and how he analyzes polls.<br />
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="577" height="360" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="flashvars" value="playlist=http%3A%2F%2Fbloggingheads%2Etv%2Fdiavlogs%2Fliveplayer%2Dplaylist%2F13822%2F00%3A00%2F40%3A04" /><param name="src" value="http://static.bloggingheads.tv/maulik/offsite/offsite_flvplayer.swf" /><embed type="application/x-shockwave-flash" width="577" height="360" src="http://static.bloggingheads.tv/maulik/offsite/offsite_flvplayer.swf" flashvars="playlist=http%3A%2F%2Fbloggingheads%2Etv%2Fdiavlogs%2Fliveplayer%2Dplaylist%2F13822%2F00%3A00%2F40%3A04"></embed></object></p>
<p>Source: [<a title="bloggingheads.tv" href="http://bloggingheads.tv/diavlogs/13822">bloggingheads.tv</a>]</p>
]]></content:encoded>
			<wfw:commentRss>http://eric.ness.net/archives/nate-silver-conn-carroll-video/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New York Stock Exchange Listing</title>
		<link>http://eric.ness.net/archives/new-york-stock-exchange-listing/</link>
		<comments>http://eric.ness.net/archives/new-york-stock-exchange-listing/#comments</comments>
		<pubDate>Sat, 28 Mar 2009 16:29:22 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[Misc.]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Stocks]]></category>

		<guid isPermaLink="false">http://eric.ness.net/?p=159</guid>
		<description><![CDATA[I guess it was only a matter of time before I started to take a look at the NYSE.]]></description>
			<content:encoded><![CDATA[<p>I guess it was only a matter of time before I started to take a look at the NYSE, in part because I use a lot of the tools analysts use when looking at stocks, but I use them to look at the environment or health systems.</p>
<p>Anyway, it is oddly difficult to find a complete listing of the stocks on the exchange &#8211; so I compiled a list of all the stocks (not including NYSE ARCA, NYSE EURONEXT, NYSE ALTERNEXT).</p>
<p><a href="http://eric.ness.net/wp-content/uploads/2009/03/nyse.xls">Download the Excel file</a>. [<a href="http://www.nyse.com/about/listed/lc_ny_name_A.html?ListedComp=All">Source</a>]</p>
<p>Some of the things I use or find interesting when looking at indicators:</p>
<p><a href="http://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:moving_average_conve">Moving Average Convergence/Divergence (MACD)</a><br />
<a href="http://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:relative_strength_in">Relative Strength Index (RSI) </a><br />
<a href="http://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:williams_r">Williams %R</a><br />
<a href="http://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:bollinger_bands">Bollinger Bands</a></p>
]]></content:encoded>
			<wfw:commentRss>http://eric.ness.net/archives/new-york-stock-exchange-listing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Euclidean Distance Score</title>
		<link>http://eric.ness.net/archives/euclidean-distance-score/</link>
		<comments>http://eric.ness.net/archives/euclidean-distance-score/#comments</comments>
		<pubDate>Fri, 31 Oct 2008 02:14:19 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[C#]]></category>

		<guid isPermaLink="false">http://eric.ness.net/?p=81</guid>
		<description><![CDATA[I am currently reading Toby Segaran's book "Programming Collective Intelligence" and one of the first topics it covers is how to determine how similar two people are.]]></description>
			<content:encoded><![CDATA[<p><a href="http://eric.ness.net/wp-content/uploads/2008/10/euclidean.jpg"><img class="alignnone size-full wp-image-82" title="euclidean" src="http://eric.ness.net/wp-content/uploads/2008/10/euclidean.jpg" alt="" width="577" height="360" /></a></p>
<p>I am currently reading Toby Segaran&#8217;s book &#8220;Programming Collective Intelligence&#8221;, and one of the first topics it covers is how to determine how similar two people are.</p>
<p>One approach is to use the Euclidean Distance Score. Arun Vijayan C has an excellent PowerPoint presentation on this topic, &#8220;Finding more people like you&#8221;:</p>
<div id="__ss_407295" style="width: 425px; text-align: left;"><a style="font: 14px Helvetica,Arial,Sans-serif; display: block; margin: 12px 0 3px 0; text-decoration: underline;" title="Finding more people like you" href="http://www.slideshare.net/arunv/finding-more-people-like-you?type=powerpoint">Finding more people like you</a><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="355" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowScriptAccess" value="always" /><param name="src" value="http://static.slideshare.net/swf/ssplayer2.swf?doc=finding-people-like-you-1210843795287894-8&amp;stripped_title=finding-more-people-like-you" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="425" height="355" src="http://static.slideshare.net/swf/ssplayer2.swf?doc=finding-people-like-you-1210843795287894-8&amp;stripped_title=finding-more-people-like-you" allowscriptaccess="always" allowfullscreen="true"></embed></object>
</div>
<p>I then wrote up some quick code in C# that uses the values in Arun&#8217;s presentation:</p>
<pre class="brush: jscript; title: ; notranslate">
// Euclidean Distance Score

using System;
using System.Collections.Generic;

namespace ConsoleApplication1
{
    internal class Program
    {
        private static void Main()
        {
            // Load People and Values
            var myP = new List&lt;People&gt;
                          {
                              new People(&quot;John&quot;, 1.5, 4),
                              new People(&quot;Ravi&quot;, 4.5, 1.5),
                              new People(&quot;Kiran&quot;, 1, 3.5),
                              new People(&quot;Deepti&quot;, 3, 5)
                          };

            // Print header
            Console.WriteLine(&quot;People And Scores&quot;);
            Console.WriteLine(&quot;###################&quot;);
            Console.WriteLine();

            // Loop through people and values
            foreach (People people in myP)
            {
                Console.WriteLine(people.Name + &quot;\t&quot; + people.xScore + &quot;\t&quot; + people.yScore);
            }

            // Print Distance And Value Headers
            Console.WriteLine();
            Console.WriteLine(&quot;Distance Comparison&quot;);
            Console.WriteLine(&quot;###################&quot;);
            Console.WriteLine();

            // Loop through people and scores
            int myCount = 1;
            for (int i = 0; i &lt; myP.Count; i++)
            {
                for (int j = myCount; j &lt; myP.Count; j++)
                {
                    // Euclidean Distance Score
                    // Sqrt( (x1-x2)^2 + (y1-y2)^2 )
                    Console.WriteLine(myP[i].Name + &quot;\t&quot; + myP[j].Name + &quot;:\t&quot; +
                                      Math.Sqrt(Math.Pow(myP[i].xScore - myP[j].xScore, 2) +
                                                Math.Pow(myP[i].yScore - myP[j].yScore, 2)).ToString(&quot;0.##&quot;));
                }

                // Skip to the next guy
                myCount++;
            }

            // Print Closer
            Console.WriteLine();
            Console.WriteLine(&quot;Press enter to continue...&quot;);
            Console.ReadLine();
        }
    }

    internal class People
    {
        public string Name;
        public double xScore;
        public double yScore;

        public People(string _Name, double _xScore, double _yScore)
        {
            Name = _Name;
            xScore = _xScore;
            yScore = _yScore;
        }
    }
}
</pre>
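<p>The raw distance gets smaller as two people get more alike, which is the opposite of what a &#8220;score&#8221; usually suggests. Here is a minimal sketch of one common fix, assuming the convention 1 / (1 + distance), so that identical people score 1 and the score falls toward 0 as they move apart. The <code>Similarity</code> class and its <code>Distance</code> and <code>Score</code> helpers are hypothetical names, not part of the listing above:</p>
<pre class="brush: jscript; title: ; notranslate">
// Distance-to-similarity sketch (assumed convention: 1 / (1 + d))

using System;

internal static class Similarity
{
    // Euclidean distance between the points (x1, y1) and (x2, y2)
    public static double Distance(double x1, double y1, double x2, double y2)
    {
        double dx = x1 - x2;
        double dy = y1 - y2;
        return Math.Sqrt(dx * dx + dy * dy);
    }

    // Hypothetical helper: maps a distance of 0 or more to a score between 0 and 1
    public static double Score(double distance)
    {
        return 1.0 / (1.0 + distance);
    }

    private static void Main()
    {
        // John (1.5, 4) and Kiran (1, 3.5), the same values as in the listing above
        double d = Distance(1.5, 4, 1, 3.5);
        Console.WriteLine(&quot;Distance:\t&quot; + d.ToString(&quot;0.##&quot;));
        Console.WriteLine(&quot;Score:\t&quot; + Score(d).ToString(&quot;0.##&quot;));
    }
}
</pre>
<p>For John and Kiran the distance is the square root of 0.5, roughly 0.71, which gives a score of roughly 0.59.</p>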
]]></content:encoded>
			<wfw:commentRss>http://eric.ness.net/archives/euclidean-distance-score/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Presidential Election Poll Stats</title>
		<link>http://eric.ness.net/archives/presidential-election-poll-stats/</link>
		<comments>http://eric.ness.net/archives/presidential-election-poll-stats/#comments</comments>
		<pubDate>Wed, 10 Sep 2008 18:01:33 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[Misc.]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://eric.ness.net/?p=70</guid>
		<description><![CDATA[I love watching the polls. Here are some of my favorite links.]]></description>
			<content:encoded><![CDATA[<p><a href="http://eric.ness.net/wp-content/uploads/2008/09/election_map.jpg"><img class="alignnone size-full wp-image-71" title="election_map" src="http://eric.ness.net/wp-content/uploads/2008/09/election_map.jpg" alt="" width="577" height="360" /></a></p>
<p>I love watching the polls, so I thought I would post links to some of my favorite sites that I think do a pretty good job of covering them.</p>
<p><a title="http://pollster.com/" href="http://pollster.com/">Pollster</a></p>
<p><a title="http://www.realclearpolitics.com" href="http://www.realclearpolitics.com/epolls/2008/president/us/general_election_mccain_vs_obama-225.html">Real Clear Politics</a></p>
<p><a title="http://www.fivethirtyeight.com/" href="http://www.fivethirtyeight.com/">Five Thirty Eight</a></p>
<p><a title="http://www.intrade.com" href="http://www.intrade.com/jsp/intrade/trading/t_index.jsp">Intrade</a></p>
]]></content:encoded>
			<wfw:commentRss>http://eric.ness.net/archives/presidential-election-poll-stats/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Benford&#8217;s Law</title>
		<link>http://eric.ness.net/archives/benfords-law/</link>
		<comments>http://eric.ness.net/archives/benfords-law/#comments</comments>
		<pubDate>Thu, 04 Sep 2008 16:58:22 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Video/Audio]]></category>
		<category><![CDATA[Video]]></category>

		<guid isPermaLink="false">http://eric.ness.net/?p=66</guid>
		<description><![CDATA[Benford's Law roughly states that the leading digits of measurements in nature often follow a logarithmic distribution.]]></description>
			<content:encoded><![CDATA[<p><a href="http://eric.ness.net/wp-content/uploads/2008/09/bensfordlaw.jpg"><img class="alignnone size-full wp-image-67" title="bensfordlaw" src="http://eric.ness.net/wp-content/uploads/2008/09/bensfordlaw.jpg" alt="" width="577" height="360" /></a></p>
<p>Benford&#8217;s Law roughly states that the leading digits of measurements in nature often follow a logarithmic distribution. The counter-intuitive result is that in a wide variety of natural data sets, including stock prices and population numbers, small leading digits turn up far more often than large ones: about 30% of values begin with 1.</p>
<p>This video gives a rough explanation of how Benford&#8217;s Law can be used in fraud detection.<br />
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="344" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="src" value="http://www.youtube.com/v/O8N26edbqLM&amp;hl=en&amp;fs=1" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="425" height="344" src="http://www.youtube.com/v/O8N26edbqLM&amp;hl=en&amp;fs=1" allowfullscreen="true"></embed></object></p>
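<p>For the first digit, the law is usually written as P(d) = log10(1 + 1/d) for d = 1 to 9, so about 30.1% of values should start with 1 and only about 4.6% with 9. Below is a rough sketch of how a check might begin, comparing that expected share with the observed share in a sample. The <code>Benford</code> class, its helpers and the choice of the first 60 powers of two as sample data are assumptions for illustration, not something taken from the video:</p>
<pre class="brush: jscript; title: ; notranslate">
// Benford's Law sketch: expected vs. observed first-digit shares

using System;

internal static class Benford
{
    // Expected share of values whose leading digit is the given digit (1 to 9)
    public static double Expected(int digit)
    {
        return Math.Log10(1.0 + 1.0 / digit);
    }

    // Leading decimal digit of a positive integer
    public static int LeadingDigit(long value)
    {
        while (value &gt;= 10)
        {
            value /= 10;
        }
        return (int) value;
    }

    private static void Main()
    {
        // Sample data (an assumption): the first 60 powers of two, which fit in a long
        const int n = 60;
        var counts = new int[10];
        long v = 1;
        for (int i = 0; i &lt; n; i++)
        {
            counts[LeadingDigit(v)]++;
            v *= 2;
        }

        // Print digit, expected share and observed share
        Console.WriteLine(&quot;Digit\tExpected\tObserved&quot;);
        for (int d = 1; d &lt;= 9; d++)
        {
            double observed = (double) counts[d] / n;
            Console.WriteLine(d + &quot;\t&quot; + Expected(d).ToString(&quot;0.000&quot;) + &quot;\t&quot; + observed.ToString(&quot;0.000&quot;));
        }
    }
}
</pre>
<p>A fraud check along these lines would swap the powers of two for the leading digits of real figures, such as invoice amounts, and look for digits whose observed share strays far from the expected one.</p>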
]]></content:encoded>
			<wfw:commentRss>http://eric.ness.net/archives/benfords-law/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

