Списание Статистика (Statistics Journal)


APPLICABLE BIG DATA ASPECTS IN OFFICIAL STATISTICS

Galya Stateva

Abstract: This article introduces the reader to a practical application of ‘Big Data’, realized in an empirical study carried out by an NSI team on the topic ‘Extracting information from the Internet for business features of enterprises (web-scraping)’. The Introduction explains the main purpose of the study: to explore the possibilities of applying ‘web-scraping’ and ‘text mining’ techniques, and to evaluate the effect of their use on data collection and on improving the quality of enterprise information in the NSI Statistical Business Register by accessing the enterprises' websites. Chapter I presents the technological environment in which the study was implemented, including a common reference logical architecture for applying ‘web-scraping’ techniques. It characterizes the ‘web-scraping’ technique in detail and describes the cases in which specific and generic ‘web-scraping’ are used. Chapter II is devoted to the practical realization of the four pilot ‘scenarios’. The implementation and results of each ‘use-case’ are presented in the same sequence: objective, resource and technological provision, results achieved, and legal constraints.

Keywords:

Publication date: 2018-11-01
