Mini-project 2020-03-25: Building a concept tree with SPARQL & Python - Wikidata visualizations
Discoveries along the way
(I simply didn't know about these yet ;-) )
Wikidata Graph Builder
https://angryloki.github.io/wikidata-graph-builder/
https://angryloki.github.io/wikidata-graph-builder/?mode=wdqs&wdqs=SELECT%20%3Fitem%20%3FitemLabel%20%0AWHERE%20%0A%7B%0A%20%20%3Fitem%20wdt:P31%7Cwdt:P527%7Cwdt:P279%20wd:Q17155032.%0A%20%20SERVICE%20wikibase:label%20%7B%20bd:serviceParam%20wikibase:language%20%22%5BAUTO_LANGUAGE%5D,en%22.%20%7D%0A%7D
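For the record, this is the SPARQL query that is URL-encoded in the Graph Builder link above, plus a minimal sketch of running it directly against the Wikidata Query Service with SPARQLWrapper (the same library used in the code further down). The User-Agent string is just a placeholder I made up.

from SPARQLWrapper import SPARQLWrapper, JSON

# Query decoded from the Graph Builder URL above: all items linked to
# wd:Q17155032 via instance of (P31), has part (P527) or subclass of (P279).
# [AUTO_LANGUAGE] is only substituted by the query.wikidata.org UI; when run
# from a script, the label service simply falls back to "en".
query = """
SELECT ?item ?itemLabel
WHERE {
  ?item wdt:P31|wdt:P527|wdt:P279 wd:Q17155032 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
"""

# WDQS expects a descriptive User-Agent; the string here is only a placeholder.
sparql = SPARQLWrapper("https://query.wikidata.org/sparql",
                       agent="concept-tree-example/0.1 (personal experiment)")
sparql.setQuery(query)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

for row in results["results"]["bindings"]:
    print(row["item"]["value"], row["itemLabel"]["value"])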
Wikidata:Tools/Visualize data/de - Wikidata
Cool tools.
https://www.wikidata.org/wiki/Wikidata:Tools/Visualize_data/de
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I am still looking for this kind of visualization:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Maybe one of the links listed above offers it; so far I have only had a quick look.
It comes from this tutorial:
Linked Data: combining data from Wikidata and Eurostat datasets
http://pyvandenbussche.info/2019/linked-data-combining-data-from-wikidata-and-eurostat-datasets/
And with the code I copied from it (see below!) I generated the following charts:
Code
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
import rdflib
from SPARQLWrapper import SPARQLWrapper, JSON
import json
#%matplotlib inline
def get_sparql_dataframe(service, query):
    """
    Helper function to convert SPARQL results into a Pandas data frame.
    """
    sparql = SPARQLWrapper(service)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    result = sparql.query()
    processed_results = json.load(result.response)
    cols = processed_results['head']['vars']
    out = []
    for row in processed_results['results']['bindings']:
        item = []
        for c in cols:
            item.append(row.get(c, {}).get('value'))
        out.append(item)
    return pd.DataFrame(out, columns=cols)
endpoint = "https://query.wikidata.org/sparql"
query = """
select * {
?country wdt:P31 wd:Q3624078 .
?country rdfs:label ?name .
filter(lang(?name) = 'en')
?country wdt:P2250 ?life_exp .
?country wdt:P2132 ?GDP_per_capita .
?country wdt:P1082 ?population .
?country wdt:P297 ?code .
Optional{?country wdt:P605 ?nuts_code . }
}
"""
countries = get_sparql_dataframe(endpoint, query)
print(countries)
# keep only country-level NUTS codes (two characters) or rows without a NUTS code
countries = countries.loc[~(countries['nuts_code'].str.len() > 2)]
countries['population'] = countries['population'].astype('float64')
countries['life_exp'] = countries['life_exp'].astype('float64')
countries['GDP_per_capita'] = countries['GDP_per_capita'].astype('float64')
print(countries.head())
# Similar to Hans Rosling presentation
countries['scaled population'] = ((countries['population'] - countries['population'].min())/200000+1).astype(int)
ax = countries.plot.scatter(x='GDP_per_capita', y='life_exp', s=countries['scaled population'],
figsize=(20, 10), cmap=cm.get_cmap('viridis'),c='life_exp', colorbar=False,
alpha=0.6, edgecolor=(0,0,0,.2));
countries[['GDP_per_capita','life_exp','code']].apply(lambda x: ax.text(*x),axis=1);
plt.show()
# Get Eurostat Linked Data through their portal: http://estatwrap.ontologycentral.com/
g=rdflib.Graph()
g.parse('http://estatwrap.ontologycentral.com/data/educ_regind', format='application/rdf+xml')
qres = g.query(
"""
PREFIX sdmx-measure: <http://purl.org/linked-data/sdmx/2009/measure#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX eus: <http://ontologycentral.com/2009/01/eurostat/ns#>
SELECT ?geo ?value
WHERE {
?obs dcterms:date "2012".
?obs eus:geo ?geo.
?obs eus:indic_ed <http://estatwrap.ontologycentral.com/dic/indic_ed#R05_1> .
?obs sdmx-measure:obsValue ?value .
}""")
# Participation rates of 4-years-olds in education at regional level
eurostat = pd.DataFrame(columns = ['eurostat_country','education_value'], data=[[str(geo), str(value)] for geo, value in qres])
eurostat['nuts_code'] = eurostat['eurostat_country'].str.split('#').str[1]
eurostat['education_value'] = eurostat['education_value'].astype(float)
eurostat.head()
linked_data = eurostat.merge(countries, how='inner', on='nuts_code')
linked_data
linked_data['scaled population'] = linked_data['scaled population']*10
ax2 = linked_data.plot.scatter(x='GDP_per_capita', y='education_value', s=linked_data['scaled population'],
figsize=(20, 10), cmap=cm.get_cmap('viridis'),c='education_value', colorbar=False,
alpha=0.6, edgecolor=(0,0,0,.2));
linked_data[['GDP_per_capita','education_value','nuts_code']].apply(lambda x: ax2.text(*x),axis=1);
plt.show()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Embedding Projector - TensorFlow
https://projector.tensorflow.org/
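As far as I can tell, the projector also accepts your own data via its "Load" dialog: a TSV file with one embedding per line and, optionally, a metadata TSV with labels. A minimal sketch with made-up placeholder vectors (real input could be entity embeddings such as the TransE vectors from the post linked below):

import numpy as np

# Placeholder data: 5 random 64-dimensional vectors standing in for real embeddings.
vectors = np.random.rand(5, 64)
labels = ["Q1", "Q2", "Q3", "Q4", "Q5"]  # one label per vector

# vectors.tsv: one embedding per line, dimensions separated by tabs.
np.savetxt("vectors.tsv", vectors, delimiter="\t")

# metadata.tsv: one label per line (no header row when there is only one column).
with open("metadata.tsv", "w") as f:
    f.write("\n".join(labels))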
..................................................................................
And this blog is very interesting overall:
http://pyvandenbussche.info/
http://pyvandenbussche.info/2019/linked-data-past-present-2019-and-future/
http://pyvandenbussche.info/2017/translating-embeddings-transe/