Category Archives: Statistics

Handle bimodal dataset

If you have bimodal (or higher) dataset, then calculate the mean and subtract that from the dataset and finally calculate the absolute value of it. abs(row[‘ColumnX’] – m) Or apply central limit theory and classify points for the two distributions

0  

Tinkering with Apache Hadoop – Map Reduce Framework

I have used a Map Reduce based system at my present employer (Bank of America – Merrill Lynch) to process (read “crunch”) extremely large datasets in matter of seconds. Sometimes I used those to price bonds in real-time otherwise it was used for data processing/reporting purposes. It is an in-house product, known as Hugs framework […]

2