Mini-project 2020-03-25: Building a concept tree with SPARQL & Python - Wikidata visualizations
Discoveries along the way
(I simply didn't know about these yet ;-) )
Wikidata Graph Builder
https://angryloki.github.io/wikidata-graph-builder/
https://angryloki.github.io/wikidata-graph-builder/?mode=wdqs&wdqs=SELECT%20%3Fitem%20%3FitemLabel%20%0AWHERE%20%0A%7B%0A%20%20%3Fitem%20wdt:P31%7Cwdt:P527%7Cwdt:P279%20wd:Q17155032.%0A%20%20SERVICE%20wikibase:label%20%7B%20bd:serviceParam%20wikibase:language%20%22%5BAUTO_LANGUAGE%5D,en%22.%20%7D%0A%7D
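For the record, this is the SPARQL query that is URL-encoded in the Graph Builder link above, plus a minimal sketch of running it directly against the Wikidata Query Service with SPARQLWrapper (the same library used in the code further down). The User-Agent string is just a placeholder I made up.

from SPARQLWrapper import SPARQLWrapper, JSON

# Query decoded from the Graph Builder URL above: all items linked to
# wd:Q17155032 via instance of (P31), has part (P527) or subclass of (P279).
# [AUTO_LANGUAGE] is only substituted by the query.wikidata.org UI; when run
# from a script, the label service simply falls back to "en".
query = """
SELECT ?item ?itemLabel
WHERE {
  ?item wdt:P31|wdt:P527|wdt:P279 wd:Q17155032 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
"""

# WDQS expects a descriptive User-Agent; the string here is only a placeholder.
sparql = SPARQLWrapper("https://query.wikidata.org/sparql",
                       agent="concept-tree-example/0.1 (personal experiment)")
sparql.setQuery(query)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

for row in results["results"]["bindings"]:
    print(row["item"]["value"], row["itemLabel"]["value"])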
Wikidata:Tools/Visualize data/de - Wikidata
Cool tools.
https://www.wikidata.org/wiki/Wikidata:Tools/Visualize_data/de
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I am still looking for this kind of visualization:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Maybe one of the links listed above offers it; so far I have only had a quick look.
It comes from this tutorial:
Linked Data: combining data from Wikidata and Eurostat datasets
http://pyvandenbussche.info/2019/linked-data-combining-data-from-wikidata-and-eurostat-datasets/
And with the code I copied from it (see below!) I generated the following charts:
Code
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
import rdflib
from SPARQLWrapper import SPARQLWrapper, JSON
import json
#%matplotlib inline
def get_sparql_dataframe(service, query):
    """
    Helper function to convert SPARQL results into a Pandas data frame.
    """
    sparql = SPARQLWrapper(service)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    result = sparql.query()
    processed_results = json.load(result.response)
    cols = processed_results['head']['vars']
    out = []
    for row in processed_results['results']['bindings']:
        item = []
        for c in cols:
            item.append(row.get(c, {}).get('value'))
        out.append(item)
    return pd.DataFrame(out, columns=cols)
endpoint = "https://query.wikidata.org/sparql"
query = """
select * {
?country wdt:P31 wd:Q3624078 .
?country rdfs:label ?name .
filter(lang(?name) = 'en')
?country wdt:P2250 ?life_exp .
?country wdt:P2132 ?GDP_per_capita .
?country wdt:P1082 ?population .
?country wdt:P297 ?code .
Optional{?country wdt:P605 ?nuts_code . }
}
"""
countries = get_sparql_dataframe(endpoint, query)
print(countries)
# keep only country-level NUTS codes (two characters) or rows without a NUTS code
countries = countries.loc[~(countries['nuts_code'].str.len() > 2)]
countries['population'] = countries['population'].astype('float64')
countries['life_exp'] = countries['life_exp'].astype('float64')
countries['GDP_per_capita'] = countries['GDP_per_capita'].astype('float64')
print(countries.head())
# Similar to Hans Rosling presentation
countries['scaled population'] = ((countries['population'] - countries['population'].min())/200000+1).astype(int)
ax = countries.plot.scatter(x='GDP_per_capita', y='life_exp', s=countries['scaled population'],
figsize=(20, 10), cmap=cm.get_cmap('viridis'),c='life_exp', colorbar=False,
alpha=0.6, edgecolor=(0,0,0,.2));
countries[['GDP_per_capita','life_exp','code']].apply(lambda x: ax.text(*x),axis=1);
plt.show()
# Get Eurostat Linked Data through their portal: http://estatwrap.ontologycentral.com/
g=rdflib.Graph()
g.parse('http://estatwrap.ontologycentral.com/data/educ_regind', format='application/rdf+xml')
qres = g.query(
"""
PREFIX sdmx-measure: <http://purl.org/linked-data/sdmx/2009/measure#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX eus: <http://ontologycentral.com/2009/01/eurostat/ns#>
SELECT ?geo ?value
WHERE {
?obs dcterms:date "2012".
?obs eus:geo ?geo.
?obs eus:indic_ed <http://estatwrap.ontologycentral.com/dic/indic_ed#R05_1> .
?obs sdmx-measure:obsValue ?value .
}""")
# Participation rates of 4-years-olds in education at regional level
eurostat = pd.DataFrame(columns = ['eurostat_country','education_value'], data=[[str(geo), str(value)] for geo, value in qres])
eurostat['nuts_code'] = eurostat['eurostat_country'].str.split('#').str[1]
eurostat['education_value'] = eurostat['education_value'].astype(float)
eurostat.head()
linked_data = eurostat.merge(countries, how='inner', on='nuts_code')
linked_data
linked_data['scaled population'] = linked_data['scaled population']*10
ax2 = linked_data.plot.scatter(x='GDP_per_capita', y='education_value', s=linked_data['scaled population'],
figsize=(20, 10), cmap=cm.get_cmap('viridis'),c='education_value', colorbar=False,
alpha=0.6, edgecolor=(0,0,0,.2));
linked_data[['GDP_per_capita','education_value','nuts_code']].apply(lambda x: ax2.text(*x),axis=1);
plt.show()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Embedding Projector - TensorFlow
https://projector.tensorflow.org/
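As far as I can tell, the projector also accepts your own data via its "Load" dialog: a TSV file with one embedding per line and, optionally, a metadata TSV with labels. A minimal sketch with made-up placeholder vectors (real input could be entity embeddings such as the TransE vectors from the post linked below):

import numpy as np

# Placeholder data: 5 random 64-dimensional vectors standing in for real embeddings.
vectors = np.random.rand(5, 64)
labels = ["Q1", "Q2", "Q3", "Q4", "Q5"]  # one label per vector

# vectors.tsv: one embedding per line, dimensions separated by tabs.
np.savetxt("vectors.tsv", vectors, delimiter="\t")

# metadata.tsv: one label per line (no header row when there is only one column).
with open("metadata.tsv", "w") as f:
    f.write("\n".join(labels))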
..................................................................................
And this blog is very interesting overall:
http://pyvandenbussche.info/
http://pyvandenbussche.info/2019/linked-data-past-present-2019-and-future/
http://pyvandenbussche.info/2017/translating-embeddings-transe/