Open Data Sources Roundup (continuously updated)


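These are my notes on open data resources, shared for everyone's reference. Please send any newly discovered open data sources to 数盟君 (DataUnion) at contact@dataunion.org. Notes on your experience applying open data are welcome too.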

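【Legal】

China Judgments Online (中国裁判文书网)

【Government Open Data】

Beijing Municipal Government Data Resources Network (北京市政务数据资源网)

Shanghai Municipal Government Data Service Portal (上海市政府数据服务网)

【Data Exchange】

Datatang (数据堂)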

Economics

American Economic Association (AEA):http://www.aeaweb.org/RFE/toc.php?show=complete

Gapminder:http://www.gapminder.org/data/

UMD:http://inforumweb.umd.edu/econdata/econdata.html

World Bank:http://data.worldbank.org/indicator

Data Science Practice

This section contains data sets used in the book “Doing Data Science” by Rachel Schutt and Cathy O’Neil (O’Reilly 2014)

Datasets on the book site:https://github.com/oreillymedia/doing_data_science

Enron Email Dataset:http://www.cs.cmu.edu/~enron/

GetGlue (time stamped events: users rating TV shows):http://bit.ly/1aL8XS0

Titanic Survival Data Set:http://bit.ly/1kJ4pkF

Half a million Hubway rides:http://hubwaydatachallenge.org/trip-history-data/

Finance

CBOE Futures Exchange:http://cfe.cboe.com/Data/

Google Finance:https://www.google.com/finance (R)

Google Trends:http://www.google.com/trends?q=google&ctab=0&geo=all&date=all&sort=0

St. Louis Fed:http://research.stlouisfed.org/fred2/ (R)

NASDAQ:https://data.nasdaq.com/

OANDA:http://www.oanda.com/ (R)

Quandl:http://www.quandl.com/

Yahoo Finance:http://finance.yahoo.com/ (R)

Government

Archived national government statistics:http://www.archive-it.org/

Australia:http://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/3301.02009?OpenDocument

Canada:http://www.data.gc.ca/default.asp?lang=En&n=5BCD274E-1

DataMarket:http://datamarket.com/

FDA:https://open.fda.gov/index.html

Fed Stats:http://www.fedstats.gov/cgi-bin/A2Z.cgi

Guardian world governments:http://www.guardian.co.uk/world-government-data

HUD:http://www.huduser.org/portal/datasets/pdrdatas.html

London, U.K. data:http://data.london.gov.uk/catalogue

New Zealand:http://www.stats.govt.nz/tools_and_services/tools/TableBuilder/tables-by…

NYC data:http://nycplatform.socrata.com/

OECD:http://www.oecd.org/document/0,3746,en_2649_201185_46462759_1_1_1_1,00.html

RITA:http://www.transtats.bts.gov/OT_Delay/OT_DelayCause1.asp

San Francisco Data sets:http://datasf.org/

U.K. Government Data:http://data.gov.uk/data

United Nations:http://data.un.org/

U.S. Federal Government Data Catalog:http://catalog.data.gov/dataset

U.S. Federal Government Agencies:http://www.data.gov/metric

US CDC Public Health datasets:http://www.cdc.gov/nchs/data_access/ftp_data.htm

The World Bank:http://wdronline.worldbank.org/

UK 2011 Census Open Atlas Project:http://www.alex-singleton.com/2011-census-open-atlas-project/

Health Care

Gapminder:http://www.gapminder.org/data/

Machine Learning

Amazon Web Services Data:http://aws.amazon.com/datasets

Airlines Data (2009 ASA Challenge):http://stat-computing.org/dataexpo/2009/the-data.html

Airports and their locations:http://www.infochimps.com/datasets/airports-and-their-locations

AppliedPredictiveModeling (R package):http://bit.ly/16wyvkG

Australian Weather:http://www.bom.gov.au/climate/dwo/

Causality Workbench:http://www.causality.inf.ethz.ch/repository.php

Edge data for US domestic flights 1990 to 2009:http://www.infochimps.com/datasets/us-domestic-flights-from-1990-to-2009

Infochimps (Tag = Bigdata):http://www.infochimps.com/tags/bigdata?page=1

Kaggle competition data:http://www.kaggle.com/

KDnuggets competition site:http://www.kdnuggets.com/datasets/

The Koblenz Network Collection:http://konect.uni-koblenz.de/

Machine Learning Data Set Repository:http://mldata.org/

Medicare Data File:http://go.cms.gov/19xxPN4

Microsoft Research:http://research.microsoft.com/apps/dp/dl/downloads.aspx

Million Song Dataset:http://blog.echonest.com/post/3639160982/million-song-dataset

More song datasets:http://labrosa.ee.columbia.edu/millionsong/pages/additional-datasets

MovieLens Data Sets:http://datahub.io/dataset/movielens

RDataMining.com R and Data Mining ebook data:http://www.rdatamining.com/data

The Revolution Analytics Collection:http://www.revolutionanalytics.com/subscriptions/datasets/

Social Networking:http://www.cs.cmu.edu/~jelsas/data/ancestry.com/

UCI Machine Learning Repository:http://archive.ics.uci.edu/ml/

53.5 billion clicks:http://cnets.indiana.edu/groups/nan/webtraffic/click-dataset

Networks

Stanford Large Network Dataset Collection:http://snap.stanford.edu/data/

Public Domain Collections

Data360:http://www.data360.org/index.aspx

Datamob.org:http://datamob.org/datasets

Factual:http://www.factual.com/topics/browse

Freebase:http://www.freebase.com/

Google:http://www.google.com/publicdata/directory

infochimps:http://www.infochimps.com/

Numbrary:http://numbrary.com/

Quora:http://www.quora.com/Data/Where-can-I-find-large-datasets-open-to-the-pu…

RS Collection 100+:http://rs.io/2014/05/29/list-of-data-sets.html

Sample R data sets:http://stat.ethz.ch/R-manual/R-patched/library/datasets/html/00Index.html (R)

SourceForge Research Data:http://www.nd.edu/~oss/Data/data.html

StatSci.org:http://www.statsci.org/datasets.html

UFO Reports:http://www.nuforc.org/webreports.html

WikiLeaks 9/11 pager intercepts:http://911.wikileaks.org/files/index.html

Stats4Stem.org: R data sets:http://www.stats4stem.org/data-sets.html (R)

The Washington Post List:http://www.washingtonpost.com/wp-srv/metro/data/datapost.html

Science

Agricultural Experiments:http://www.inside-r.org/packages/cran/agridat/docs/agridat (R)

Climate data:http://www.cru.uea.ac.uk/cru/data/temperature/#datter and ftp://ftp.cmdl.noaa.gov/

Gene Expression Omnibus:http://www.ncbi.nlm.nih.gov/geo/

Geo Spatial Data:http://geodacenter.asu.edu/datalist/

Human Microbiome Project:http://www.hmpdacc.org/reference_genomes/reference_genomes.php

MIT Cancer Genomics Data:http://www.broadinstitute.org/cgi-bin/cancer/datasets.cgi

NASA:http://nssdc.gsfc.nasa.gov/nssdc/obtaining_data.html

NIH Microarray data:ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE6532/ (R)

Protein structure:http://www.infobiotic.net/PSPbenchmarks/

Public Gene Data:http://www.pubgene.org/

Stanford Microarray Data:http://smd.stanford.edu//

Social Sciences

General Social Survey:http://www3.norc.org/GSS+Website/

ICPSR:http://www.icpsr.umich.edu/icpsrweb/ICPSR/access/index.jsp

Pew Research:http://www.pewinternet.org/datasets/pages/2/

SNAP:http://snap.stanford.edu/data/index.html

UCLA Social Sciences Archive:http://dataarchives.ss.ucla.edu/Home.DataPortals.htm

Upjohn Institute:http://www.upjohn.org/erdc/erdc.html

Time Series

Time Series Data Library:http://robjhyndman.com/TSDL/

Universities

Carnegie Mellon University Enron email:http://www.cs.cmu.edu/~enron/

Carnegie Mellon University StatLib:http://lib.stat.cmu.edu/datasets/

Keel Repository:http://sci2s.ugr.es/keel/datasets.php

Carnegie Mellon University JASA data archive:http://lib.stat.cmu.edu/jasadata/

Ohio State University Financial data:http://fisher.osu.edu/fin/osudata.htm

UC Berkeley:http://ucdata.berkeley.edu/

UCLA:http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data

UC Riverside Time Series:http://www.cs.ucr.edu/~eamonn/time_series_data/

University of Toronto:http://www.cs.toronto.edu/~delve/data/datasets.html
