August 2016
Study of the Power-Law Fluctuations in the Email Size
Physics and Society
- Language
- English
- Publication type
- Research paper (scientific journal)
In a previous study, we investigated the frequency distribution of the email
size in the system log data of the main email server for the staff on a campus
network. We found that the frequencies of email sizes followed a power-law
distribution and discovered two inflection points in the distribution. After
analyzing these results, we collected new system log data for both staff and
students for the period from April 1, 2009 to March 31, 2015 and analyzed the
frequency distributions per academic year. The results of the earlier
investigation were replicated for each of these distributions. Then, we
disaggregated the system log data for the staff for the period from May 1, 2015
to July 31, 2015 using the email header "Content-Type" and created four
subdistributions. Frequency distributions were calculated for the disaggregated
data. We then proposed and evaluated a model to explain the overall frequency
distribution as a sum of the four subdistributions. The correlation coefficient
between the observed frequency distribution and the distribution predicted by
our model was 0.8408 for the staff. This coefficient confirmed that our
approach can successfully model and predict the size distribution of emails.
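
The idea of modelling an overall size distribution as a sum of subdistributions, and scoring it with a correlation coefficient, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the data are synthetic (Pareto-distributed sizes), the four groups are unnamed stand-ins for the Content-Type classes, the size range and bin count are hypothetical, and each subdistribution is approximated by a straight-line fit in log-log space, which may differ from the model the authors actually used.

```python
import math
import random

# Hypothetical size range (bytes) and number of logarithmic bins.
LO, HI, N_BINS = 1e2, 1e8, 24
WIDTH = (math.log10(HI) - math.log10(LO)) / N_BINS


def bin_centers():
    """Geometric centre of each logarithmic bin."""
    return [10 ** (math.log10(LO) + (i + 0.5) * WIDTH) for i in range(N_BINS)]


def log_binned_counts(sizes):
    """Frequency of sizes per logarithmic bin; sizes outside [LO, HI) are dropped."""
    counts = [0] * N_BINS
    for s in sizes:
        if LO <= s < HI:
            idx = int((math.log10(s) - math.log10(LO)) / WIDTH)
            counts[min(idx, N_BINS - 1)] += 1
    return counts


def fit_power_law(centers, counts):
    """Least-squares line through (log10 size, log10 count) over non-empty bins.

    Returns the fitted count per bin, with 0.0 for bins that were empty.
    """
    pts = [(math.log10(c), math.log10(n)) for c, n in zip(centers, counts) if n > 0]
    if len(pts) < 2:
        return [float(n) for n in counts]
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    slope = sum((x - mx) * (y - my) for x, y in pts) / sxx
    intercept = my - slope * mx
    return [
        10 ** (intercept + slope * math.log10(c)) if n > 0 else 0.0
        for c, n in zip(centers, counts)
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


# Synthetic stand-in for four Content-Type groups: (scale in bytes, Pareto exponent).
random.seed(0)
params = [(2e2, 1.2), (1e3, 1.0), (2e4, 0.9), (1e5, 1.5)]
groups = [[scale * random.paretovariate(a) for _ in range(5000)] for scale, a in params]

centers = bin_centers()
sub_counts = [log_binned_counts(g) for g in groups]
observed = log_binned_counts([s for g in groups for s in g])

# Model: overall distribution = sum of the four fitted subdistributions.
fitted = [fit_power_law(centers, c) for c in sub_counts]
predicted = [sum(col) for col in zip(*fitted)]

r = pearson(observed, predicted)
print(f"correlation between observed and modelled counts: {r:.4f}")
```

Because each group starts at a different minimum size, the summed curve changes slope where a new group begins to contribute, which is one plausible way for inflection points to appear in an aggregate distribution. The value printed here describes only the synthetic data and is not comparable to the 0.8408 reported above.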
- ID information
- arXiv ID : arXiv:1608.08503