[datascience] Seminar by Giuseppe Arbia - 29 April, 11:00-12:00


  • From: "Gianluca Cubadda"
  • To: "Informal Data Science Group"
  • Subject: [datascience] Seminar by Giuseppe Arbia - 29 April, 11:00-12:00
  • Date: Wed, 15 Apr 2020 19:00:55 +0200

Dear all,

I hope all is well.

I am writing to let you know that on 29 April, 11:00-12:00, Prof. Giuseppe Arbia (Università Cattolica di Roma) will give the following seminar via Teams:

 

On spatial econometric models estimated using crowdsourcing, web-scraping or other unconventionally collected Big Data

 

Abstract: The Big Data revolution is challenging state-of-the-art statistical techniques not only because of the computational burden connected with the high volume and speed at which data are generated, but even more because of the variety of sources through which data are collected (Arbia, 2019). This paper concentrates specifically on this last aspect. Common examples of non-traditional Big Data sources are crowdsourcing and web scraping. A characteristic common to these unconventional data collections is the lack of any precise statistical sample design, a situation described in statistics as “convenience sampling”. As is well known, under these conditions no probabilistic inference is possible. To overcome this problem, Arbia et al. (2018) proposed the use of a special form of post-stratification (termed “post-sampling”), with which data are manipulated prior to their use in an inferential context. In this paper we generalize this approach, using the same idea to estimate a spatial regression model. We start by showing through a Monte Carlo study that, when data are collected without a proper design, parameter estimates can be biased. Secondly, we propose a post-sampling strategy to tackle this problem. We show that the proposed strategy indeed achieves a bias reduction, but at the price of a concomitant increase in the variance of the estimators. We thus suggest an MSE-correction operational strategy. The paper also contains a formal derivation of the increase in variance implied by the post-sampling procedure and concludes with an empirical illustration of the method in the estimation of a hedonic price model in the city of Milan using web-scraped data.
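
For readers who want a feel for the bias/variance trade-off mentioned in the abstract, here is a minimal toy sketch. It is not the post-sampling procedure of Arbia et al., and it involves no spatial regression: it only applies ordinary post-stratification to a sample mean. The two strata, their population shares, means, standard deviations and the convenience-sample shares are all hypothetical values chosen for illustration.

```python
import random
import statistics

# Hypothetical two-stratum population (all numbers are illustrative assumptions).
POP_SHARE = [0.5, 0.5]      # true population share of each stratum
MEAN = [10.0, 20.0]         # stratum means
SD = [2.0, 8.0]             # stratum standard deviations
SAMPLE_SHARE = [0.9, 0.1]   # the convenience sample over-represents stratum 0
TRUE_MEAN = sum(w * m for w, m in zip(POP_SHARE, MEAN))


def draw_convenience_sample(rng, n):
    """Draw n values; stratum membership follows SAMPLE_SHARE, not POP_SHARE."""
    strata = [[], []]
    for _ in range(n):
        h = 0 if rng.random() < SAMPLE_SHARE[0] else 1
        strata[h].append(rng.gauss(MEAN[h], SD[h]))
    return strata


def naive_mean(strata):
    """Plain sample mean, ignoring how the sample was collected."""
    values = strata[0] + strata[1]
    return sum(values) / len(values)


def post_stratified_mean(strata):
    """Reweight each stratum mean by its known population share."""
    return sum(w * statistics.fmean(s) for w, s in zip(POP_SHARE, strata))


def monte_carlo(replications=2000, n=100, seed=0):
    """Estimate bias, variance and MSE of both estimators by simulation."""
    rng = random.Random(seed)
    naive, post = [], []
    while len(naive) < replications:
        strata = draw_convenience_sample(rng, n)
        if not all(strata):  # skip the rare draw with an empty stratum
            continue
        naive.append(naive_mean(strata))
        post.append(post_stratified_mean(strata))
    summary = {}
    for name, estimates in (("naive", naive), ("post", post)):
        bias = statistics.fmean(estimates) - TRUE_MEAN
        var = statistics.pvariance(estimates)
        summary[name] = {"bias": bias, "variance": var, "mse": bias**2 + var}
    return summary


if __name__ == "__main__":
    for name, stats in monte_carlo().items():
        print(name, {k: round(v, 3) for k, v in stats.items()})
```

With these assumed values the reweighted estimator should show a much smaller bias and a larger variance than the naive mean, because the under-represented stratum contributes few, noisy observations that receive a large weight. Comparing the two `mse` entries is one simple way to see whether the trade is worthwhile in a given setting.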

 

The link to join the seminar Team is the following:

https://teams.microsoft.com/l/team/19%3a7549bf2fd118416ca28010866489f509%40thread.tacv2/conversations?groupId=991b2459-4168-4aa8-95a4-f16209585dc2&tenantId=24c5be2a-d764-40c5-9975-82d08ae47d0e

 

Afterwards, Giuseppe will present his candidacy for President of the SIS, but I imagine this second item will interest (if anyone) only the statisticians… 😉

 

Have a good evening,

 

Gianluca

 

 

---
Gianluca Cubadda

Full Professor of Economic Statistics

Dean of the School of Economics (https://economia.uniroma2.it)

Coordinator of the Master Big Data in Business (http://bigdata.uniroma2.it)

University of Rome "Tor Vergata"

Via Columbia 2, 00133 Rome - Italy

Tel. +39 06 7259 5500

WEB page: http://directory.uniroma2.it/index.php/chart/dettagliDocente/3916

 


