The semantic geoweb II

(Article series "Dit og mit kort", 68). Last week I showed three examples of REST and SPARQL queries against the LinkedGeoData project. I have now set up a web page with 11 different examples of REST and SPARQL queries. For example, showing all bicycle shops on Vesterbro inside the geographic rectangle (bbox=12.529400,55.662703,12.569705,55.674555) looks like this:

http://linkedgeodata.org/page/near/55.662703-55.674555,12.529400-12.569705/class/BicycleShop

(You can try the search above here, and as something new you can now get the result shown directly in RDF/XML format.)

You can change the latitudes and longitudes in the examples on the web page yourself and try them out. You can use my latitude and longitude tool to find the coordinates for the area you want to try.
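To make the URL pattern easier to experiment with, here is a minimal Python sketch that builds a bounding-box URL like the one above. The function name `bbox_url` and its parameters are my own illustration, not part of LinkedGeoData. The pattern is read off the single example URL, so treat it as an assumption rather than a documented API.

```python
def bbox_url(lon_min: str, lat_min: str, lon_max: str, lat_max: str, osm_class: str) -> str:
    """Build a LinkedGeoData bounding-box URL (pattern inferred from the example above)."""
    base = "http://linkedgeodata.org/page/near"
    # The URL lists the latitude range first, then the longitude range.
    return f"{base}/{lat_min}-{lat_max},{lon_min}-{lon_max}/class/{osm_class}"


# bbox order in the post: lon_min, lat_min, lon_max, lat_max
url = bbox_url("12.529400", "55.662703", "12.569705", "55.674555", "BicycleShop")
print(url)
```

The coordinates are passed as strings so that they end up in the URL exactly as typed, trailing zeros included.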

The semantic geoweb

(Article series "Dit og mit kort", 65). You have probably heard the talk about the semantic web, and perhaps also about the Linked Data project, where information is linked across datasets to get a deeper understanding of, and more meaning out of, the information on the WWW. There is also a sister project that focuses on geodata. The project is called LinkedGeoData, which in short is:

LinkedGeoData is an effort to add a spatial dimension to the Web of Data / Semantic Web. LinkedGeoData uses the information collected by the OpenStreetMap project and makes it available as an RDF knowledge base according to the Linked Data principles. It interlinks this data with other knowledge bases in the Linking Open Data initiative. [...]It currently consists of information about approx. 350 million nodes and 30 million ways and the resulting RDF data comprises approximately 2 billion triples.

LinkedGeoData's use of OpenStreetMap is interesting because this RDF geodataset is clearly the largest freely available one on the WWW. Size is measured here by the number of unique objects (many millions more than Wikipedia holds).

What are we going to use these semantic experiments with geodata for? First and foremost, to become wiser about our surroundings geographically. This is best illustrated by showing some REST queries against LinkedGeoData.

First example: show all pubs within a radius of 500 meters of Copenhagen Central Station. That is done with the following:

http://linkedgeodata.org/page/near/55.6720669276334,12.565441131591797/500/class/Pub
(You can try the search above yourself here)

We now get a list back (in HTML) of places, with latitudes and longitudes, that match the search argument above. You can then choose to get the list in formats such as RDF/XML, N-Triples, Turtle and N3. These formats are familiar to people who work with, and take an interest in, the semantic web (I will not go into more detail in this post).

Second example: show all cash machines (ATMs) within a radius of 1000 meters of Vesterbro Torv. Search argument:

http://linkedgeodata.org/page/near/55.67227263725309,12.55436897277/1000/class/Atm
(You can try the search above yourself here)

Again we get back a list that meets the conditions above.
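The two radius searches follow the same pattern: latitude, longitude, radius in meters and an OpenStreetMap class. A small Python sketch can generate such URLs. The helper name `near_url` is hypothetical, and the pattern is inferred from the two example URLs only.

```python
def near_url(lat: str, lon: str, radius_m: int, osm_class: str) -> str:
    """Build a LinkedGeoData radius-search URL (pattern inferred from the examples above)."""
    if radius_m <= 0:
        raise ValueError("radius must be a positive number of meters")
    return f"http://linkedgeodata.org/page/near/{lat},{lon}/{radius_m}/class/{osm_class}"


# Pubs within 500 m of Copenhagen Central Station
pub_url = near_url("55.6720669276334", "12.565441131591797", 500, "Pub")
# Cash machines within 1000 m of Vesterbro Torv
atm_url = near_url("55.67227263725309", "12.55436897277", 1000, "Atm")
print(pub_url)
print(atm_url)
```

Coordinates are kept as strings so that no digits are lost or added when the URL is assembled.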

Last example for today. Here we will do an advanced search using the OpenLink Virtuoso SPARQL Query tool (SPARQL is a query language for RDF data). Show all places of worship (churches, mosques, synagogues and so on) within a radius of 5 km of Vesterbro Torv, also find all pubs located at most 200 meters from these places of worship, and give us a list of the results.

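PREFIX lgdo: <http://linkedgeodata.org/ontology/>
# geo: and rdfs: are predefined in Virtuoso; they are declared here so the query is self-contained.
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?placeofworshipname ?placeofworshipgeo ?pubname ?pubgeo
FROM <http://linkedgeodata.org>
WHERE {
  ?placeofworship a lgdo:PlaceOfWorship .
  ?placeofworship geo:geometry ?placeofworshipgeo .
  ?placeofworship rdfs:label ?placeofworshipname .

  ?pub a lgdo:Pub .
  ?pub geo:geometry ?pubgeo .
  ?pub rdfs:label ?pubname .

  # bif: functions are Virtuoso built-ins; distances are given in kilometers.
  FILTER(
    # Place of worship within 5 km of Vesterbro Torv (longitude, latitude)
    bif:st_intersects (?placeofworshipgeo, bif:st_point (12.55436897277, 55.67227263725309), 5) &&
    # Pub within 0.2 km (200 m) of that place of worship
    bif:st_intersects (?pubgeo, ?placeofworshipgeo, 0.2)
  ) .
}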

(You can try the search above yourself here)

Again we get back a list that meets the conditions above.
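To get a feel for what the two distance conditions in the FILTER mean, here is a small Python sketch that applies the same logic to a handful of made-up points. It uses the haversine formula for distance. That is my own choice for illustration, and I am not claiming this is how Virtuoso computes distances internally. All names and sample points are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def distance_km(lon1, lat1, lon2, lat2):
    """Great-circle distance between two lon/lat points (haversine formula)."""
    dlon = radians(lon2 - lon1)
    dlat = radians(lat2 - lat1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def pubs_near_worship(center, places, pubs, outer_km=5.0, inner_km=0.2):
    """Return (place, pub) name pairs that satisfy both distance conditions."""
    pairs = []
    for place_name, place_lon, place_lat in places:
        # Condition 1: the place of worship lies within outer_km of the center.
        if distance_km(place_lon, place_lat, *center) > outer_km:
            continue
        for pub_name, pub_lon, pub_lat in pubs:
            # Condition 2: the pub lies within inner_km of that place of worship.
            if distance_km(pub_lon, pub_lat, place_lon, place_lat) <= inner_km:
                pairs.append((place_name, pub_name))
    return pairs


center = (12.55436897277, 55.67227263725309)  # Vesterbro Torv as (lon, lat), as in the query
# Made-up sample points for illustration only, not real OpenStreetMap data.
places = [("Church A", 12.5550, 55.6725), ("Church B", 12.7000, 55.8000)]
pubs = [("Pub X", 12.5555, 55.6728), ("Pub Y", 12.5800, 55.6900)]
result = pubs_near_worship(center, places, pubs)
print(result)
```

In this sample only Church A is close enough to the center, and only Pub X is within 200 meters of it, so a single pair comes back.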

In OpenLink Virtuoso SPARQL Query you can also choose which formats you want the results returned in.
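If you would rather call the endpoint from a script than from the web form, the request can be put together as an ordinary URL. The sketch below only builds the URL and does not send anything. The endpoint address is an assumption on my part, `sparql_request_url` is a hypothetical helper, and `format` is a Virtuoso-specific parameter rather than part of the SPARQL protocol.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumption: a Virtuoso SPARQL endpoint at this address; adjust to the real one.
ENDPOINT = "http://linkedgeodata.org/sparql"


def sparql_request_url(query: str, result_format: str = "application/sparql-results+json") -> str:
    """Build a GET URL for a SPARQL query with a chosen result format."""
    params = {
        "default-graph-uri": "http://linkedgeodata.org",
        "query": query,
        "format": result_format,  # Virtuoso-specific way to pick the output format
    }
    return f"{ENDPOINT}?{urlencode(params)}"


query = "SELECT ?s WHERE { ?s a <http://linkedgeodata.org/ontology/Pub> } LIMIT 5"
request_url = sparql_request_url(query, "text/csv")
print(request_url)
```

`urlencode` takes care of escaping the braces, angle brackets and spaces in the query text.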

The output in the examples above is no better than the geodata already in OpenStreetMap, which is a good reason to become a volunteer in the OpenStreetMap project. In short, every time you add something to OpenStreetMap, you are also helping to feed the semantic geoweb.

If you are interested in the semantic web, DONA and Dagbladet Information are hosting the event “Nyheder i kontekst: Introduktion af Tagger” on 20 June 2011, where the leading experts in Denmark on the semantic web will show up and talk.

Drupal, RDF and other semantic stuff

Dries Buytaert gave a keynote at Drupalcon Boston 2008. A video mashup has been made based on his talk about semantic tools for Drupal. You will hear about some Drupal modules, for example modules for RDF, SPARQL and MIT Simile Exhibit. Read the full transcript here (a bit further down) to see what actually happens in the video demo. It is no small amount that gets stirred and cooked together in the semantic pot.

[googlevideo]http://video.google.com/videoplay?docid=8487255297768440860[/googlevideo]
(direct video link for the large version)

I have several times had the pleasure of using the MIT Simile Exhibit 2.0 tool for data visualisation myself (for example 1, 2, 3, 4).

RDFa for beginners

notizBlog has dug up a good beginner's video about RDFa. RDFa is used in XHTML to add more semantic descriptions by means of triples and more. Not much technical talk from me; the video explains it all much better. Watch the 9-minute introduction video and get wiser about the concept.

[youtube]http://www.youtube.com/watch?v=ldl0m-5zLz4[/youtube]
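To give a rough idea of what "triples in XHTML" means, here is a tiny Python sketch that reads `about` and `property` attributes from a made-up XHTML fragment and prints the resulting subject, predicate and value. It is a heavily simplified illustration and not a conforming RDFa processor. The fragment, the URLs and the function name are invented for the example.

```python
import xml.etree.ElementTree as ET

# Made-up XHTML fragment with two RDFa properties on one subject.
XHTML = """
<div xmlns:dc="http://purl.org/dc/elements/1.1/" about="http://example.org/post/1">
  <h2 property="dc:title">RDFa for beginners</h2>
  <span property="dc:creator">Example Author</span>
</div>
"""


def extract_triples(xhtml: str):
    """Collect (subject, property, value) tuples from about/property attributes."""
    triples = []

    def walk(element, subject):
        # An 'about' attribute sets the subject for this element and its children.
        subject = element.get("about", subject)
        prop = element.get("property")
        if prop is not None and subject is not None:
            # Prefer an explicit 'content' attribute, else use the visible text.
            value = element.get("content") or "".join(element.itertext()).strip()
            triples.append((subject, prop, value))
        for child in element:
            walk(child, subject)

    walk(ET.fromstring(xhtml), None)
    return triples


triples = extract_triples(XHTML)
for triple in triples:
    print(triple)
```

The point is that the same text a human reads in the browser doubles as the value of a machine-readable statement.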

Put RDF in the SCRuF bank

I have played around a bit with SCRuF, which is:

SCRuF means to scrape Microformats. Or more precisely Its a GRDDL “like” application, that Scrapes Microformats from XHTML via XSL transitions and saves the transitions in a Store as RDF and XML

Okay, gibberish on the last day of the year. So let us take an example. My blogroll here on microformats.dk uses the microformat XFN. XFN is used to describe my relationships to the people I link to in my blogroll. In the SCRuF Masher I enter the link to my domain and select the SCRuF XSL (Extensible Stylesheet Language) for the FOAF (Friend of a Friend) format below.

SCRuF masher

FOAF is an RDF (XML) format that can also be used to describe relationships and much more. One click on “Transform” and I have converted my XFN to FOAF. See the result here.

SCRuF now places my FOAF file in a folder (named after the domain) in their SCRuF Scrape Store.

My SCRuF Scrape Store

Click the folder and then “More”, and you can, for example, view my RDF data in the SIOC RDF Browser, the W3C RDF Validation Service, or simply as raw RDF. Another example from my site is all my links converted to RDF with Atom link syntax.

What can this be used for? SCRuF is an example of how data (here microformats and links) can be pulled out of ordinary XHTML pages and converted to RDF (XML) in an easy way. The data can then be reused by others, since RDF is a simple format that is easy to exchange data in. In practical terms, here and now, it is probably of no use to most webmasters, but it gives a hint of where the so-called semantic web is heading.
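SCRuF does its work with XSL stylesheets. To show the idea without XSL, here is a Python sketch that does something similar in spirit: it reads XFN `rel` values from a made-up blogroll and writes simple FOAF statements in Turtle. It is not SCRuF's stylesheet, the mapping to `foaf:knows` is a simplification, and all names and URLs are invented.

```python
import xml.etree.ElementTree as ET

# Relationship values defined by XFN 1.1
XFN_VALUES = {
    "friend", "acquaintance", "contact", "met", "co-worker", "colleague",
    "co-resident", "neighbor", "child", "parent", "sibling", "spouse", "kin",
    "muse", "crush", "date", "sweetheart", "me",
}

# Made-up blogroll; the last link has no XFN relationship.
BLOGROLL = """
<ul>
  <li><a href="http://example.org/anna" rel="friend met">Anna</a></li>
  <li><a href="http://example.org/bo" rel="colleague">Bo</a></li>
  <li><a href="http://example.org/news">News site</a></li>
</ul>
"""


def xfn_to_foaf_turtle(xhtml: str, owner: str) -> str:
    """Turn XFN links into simplified FOAF statements in Turtle syntax."""
    lines = ["@prefix foaf: <http://xmlns.com/foaf/0.1/> .", ""]
    for link in ET.fromstring(xhtml).iter("a"):
        xfn = [value for value in link.get("rel", "").split() if value in XFN_VALUES]
        if not xfn:
            continue  # ordinary link without an XFN relationship
        href = link.get("href")
        name = "".join(link.itertext()).strip()
        lines.append(f"<{owner}> foaf:knows <{href}> .  # XFN: {' '.join(xfn)}")
        lines.append(f'<{href}> foaf:name "{name}" .')
    return "\n".join(lines)


turtle = xfn_to_foaf_turtle(BLOGROLL, "http://example.org/me")
print(turtle)
```

As with SCRuF, the input has to be well-formed XHTML, otherwise the XML parser gives up before any relationships are found.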

A couple of notes about SCRuF, since otherwise it will not work: you must use XHTML, and the XHTML must be valid (follow the syntax). You should also preferably use the UTF-8 character set, since SCRuF transforms to RDF in UTF-8 (to avoid strange characters in the output). But if you are a serious webmaster, you are presumably already doing all of this? Ouch, there I stuck my head in the lion's mouth!

Update (14:05): note also that you cannot delete a folder or file you have uploaded to SCRuF. For now you have to send an email. The team behind SCRuF is working on an OpenID solution so you can help yourself.

Happy New Year!

SCRuF found via microformatique

Exclusive interview with Brian Suda

I am very happy and proud to present an exclusive interview with Brian Suda. Brian is a veteran of the microformats community and is still going strong. He is co-author of the hCard specification and the developer of the X2V and GEO Microformats to XML tools, handy and cool tools you can use a lot in your microformats projects.

[1] Søren: Welcome, Brian. The microformats community quite often uses the phrase “Designed for humans first and machines second”. How would you explain this concept to web developers and ordinary web users who have never heard of microformats?

Brian: I like to talk about microformats as “semantic sugar”. Everyone can understand that adding a little bit of sugar to your food you make it taste better. Adding microformats into your HTML makes it “taste” a little better too!

The “Designed for humans first and machines second” is attempting to point out that microformats always take into consideration the publisher first. This means that things should be as easy as possible for the person writing the HTML. There will be factors of 10 more people publishing than writing parsers, so why make it easy to parse at the expense of the publishers? The other big thing this stresses is that data should be for humans, it should be in plain view – you should see it every day through the window of your web browser. Data that is only for machines tends not to be visible to humans in a meaningful way, so we forget about it, we never update it and next thing you know it is completely wrong!

[2] Søren: You are one of the co-authors of hCard. Where did the genius idea of reusing the old vCard specification for hCard come from?

Brian: I already knew that my target was going to be vCard, so for me I was simply creating a vCard in HTML rather than XML or plain text. Basically, a good programmer is lazy. It is always a good idea never to reinvent the wheel whenever possible. Using vCard properties for class values was a logical choice; with applications already supporting it, you instantly get interoperability.

[3] Søren: The first time I made an hCard, pointed your “GEO Microformats to XML” tool at the URL of the web page containing it, and a KML file started up my Google Earth, I was very impressed and felt that this was one of the more practical things to do with microformats. Later on, I discovered that when I show new people microformats in action (in a Firefox browser with Operator), examples with hCard and maps (Google Maps, Yahoo Maps, Google Earth etc.) are the first thing they find smart and useful for themselves. Do you have the same experience when you talk at conferences, that examples with maps draw people's attention to microformats?

Brian: Certainly maps are something everyone can relate to. In my presentations I try to show at least one practical and one “far out” demonstration of microformats. Usually, I demo how to take an HTML page, from upcoming.org or any other site with an hCalendar, and then convert that to an iCalendar file.

With the newest versions of Outlook and Apple’s iCal, you can “subscribe” to events. This means that as the HTML pages are updated, your calendar application gets those updates too. This tends to really impress an audience, because we have all probably missed a meeting or event because it was rescheduled. hCalendar really scratches an itch that people have every day.

I also like to demo some crazy far out stuff too, just to get people’s minds thinking. Twitter, for example, marks up all your messages as hAtom entries. Each of these entries has a publication date, so there is no reason why it isn’t possible to extract the data, convert it to XML or JSON and have it load into a timeline or other software. Now we can begin to see Twitter posts in relation to each other's distance in time rather than just as a list.

Timeline from a Twitter feed

[4] Søren: Speaking about the upcoming Firefox 3.0 (not the version out now for testers) and the built-in microformats detection feature: will this be the breaking point and the big step ahead for microformats, so that we may see a success for microformats like the one RSS/Atom has gone through?

Brian: I think having microformats detection native to the browser will be a big benefit to adoption and awareness. RSS/Atom has exploded for lots of different reasons, but it has taken many, many years! Where I personally see microformats in the browser excelling is on mobile devices. Imagine if the browser in your phone was microformats aware. Instead of trying to re-type an event using T9, it could be one-click, save to calendar, one-click, call this person, one-click, get directions to this place from where I am standing right now based on the lat/lon of my phone’s built-in GPS unit.

If all things were equal between two websites, but one used microformats and the other didn’t, when on my phone I bet you know the one I’d pick!

For better or worse, the end-user doesn’t really care about microformats. If you look at the current Operator toolbar, it doesn’t mention the word “microformats” at all. It is all “action” based, which I think is good. My parents don’t need to know what microformats are to be able to “save to address book”. The better a technology is, the less of it you see. To most people, Firefox 3 knowing that there are 3 events in a page will just be magic. Any good web developer will want to know “how do I get my pages to have those options appear” and will learn more about microformats.

My dream would be that microformats become so ubiquitous that you don’t need to announce that they are in your page; it is just expected. Much like making valid HTML, you shouldn’t be proud and announce to the world “My HTML validates”, because that should be a baseline. It is like telling the world “I brushed my teeth this morning”. So what, I hope you did. If we evangelize enough, microformats will just become part of the HTML you produce on a daily basis.

[5] Søren: I maintain a list of Danish web sites that use microformats (the list does not include “rel-tag” web sites). The list is very short. To me it seems that microformats are largely unknown in Denmark at the moment. Do you think that microformats might face a language barrier? I am thinking that the ‘classes’, for example, are all in English [like class="street-address" in hCard etc.]. So a Danish, Swedish or Finnish web developer might think: what kind of benefit will I get from using English ‘class’ names in my markup? Will we end up with microformats in small countries/languages being something only a few hard-core techies are doing?

Brian: Microformats are small re-usable pieces of information, so I’d hope there isn’t much to remember and the barrier to entry is low. HTML is already (for better or worse) in English, so you need to understand what <p> and <strong> mean relative to Danish. This learning comes from information in your own language, but what you write in is ultimately English. This is why it is important for sites like microformats.dk, microform.at, microformats.biz and others to be localized so more people can learn about what class="street-address" means in their own language and culture, just like you needed to learn what <em> meant.

I don’t think this is a huge drawback for the interoperability. Everything is a trade-off so a unified language for describing things makes it easier to adopt globally so we all know we are talking about the same things. The people I feel sorry for are the British English speakers, it is their native language, but to them all the spelling is wrong!

[6] Søren: You have been an invited expert for the GRDDL working group at W3C. How do the folks at W3C look at microformats in the picture of creating a more Semantic Web? I have seen that Tim Berners-Lee is quite positive about microformats (like this Twitter message from Tantek Çelik: “Tim Berners-Lee just called the microformats wiki ‘a special holy place :)’”).

Brian: The W3C is a big organization, but all the people that I have met like microformats, both as an idea and a technology. Their main concern is that microformats cannot solve everything; they only cover popular aspects such as People, Places, Events, Reviews and a few others. Whereas with W3C technologies, such as RDFa, you can describe just about anything you want, but as with anything there are trade-offs.

Up until a few years ago, there were only two options, HTML and RDF. Two pretty far ends of the spectrum. HTML was for the browser and human eyes, whereas RDF was for machines (and can hurt human eyes if you look at it!).

In recent years we have been filling in that spectrum between HTML and RDF. We have the more complex markup that can describe anything, in RDFa or eRDF, and the more lightweight microformats that are easier to implement but have limitations. Then there is POSH and GRDDL to also add and extract semantics. We now have more choices and can select the best tool for the job on any given project.

I think, from the people I have talked to, that everyone agrees anything which gives more meaning to the web is a good thing. If it has enough structure, then it can be converted to other formats, RDF, RSS, Atom, etc., so that existing tools that people are familiar with can use, understand and act on the data. Microformats do this extraordinarily well for very little cost, so it is very much a positive thing for the Semantic Web.

[7] Søren: At the moment you are quite interested in OpenID working together with microformats. Can you tell a little about what the idea is behind all this?

Brian: Sure, OpenID is a way to authenticate yourself and prove you are who you say you are. Microformats can work with this to further describe more about yourself. For instance, I have a profile page at claimid.com/briansuda with an hCard and I can say that is me, but you just have to take my word for it. ClaimID is also my OpenID provider, so I can also prove that page is me by passing an OpenID challenge response system (username and password). I should be the only one who can answer that username/password, so you can trust that I control the page which I am claiming is me. It is a verified way to trust the microformatted data.

I also like using OpenID for lots of other stuff too. Friends of mine have blogs, but don’t want to list a full hCard to the general public. So the world gets their hCard with only an FN and country-name, but friends who authenticate themselves with OpenID can see a full hCard with email, tel, adr, etc. So OpenID is a way to white-list friends for sensitive data.

OpenID is a really interesting open technology which complements microformats well.

I thank Brian Suda for taking the time to answer some questions here at microformats.dk. I really liked the phrase “semantic sugar” that Brian used above and will use it from now on (in Danish translation). If some Danish readers out there are interested in microformats, maybe we could start a Danish barcamp or similar. Please feel free to contact me regarding microformats.

Some reading stuff from Brian Suda

Reading tips: microformats

James Simmons has written about “Microformats vs. RDF: How Microformats Relate to the Semantic Web”.

Matt Mullenweg's company Automattic has bought Gravatar. Gravatar (globally recognized avatar) is a web service that gives you an 80×80 pixel avatar you can use to identify yourself to the outside world on blogs and in comment fields other than your own. So what does that have to do with microformats? In the announcement of the acquisition, Matt mentions:

Allow Gravatar profile pages with Microformat support for things like XFN rel=”me” and hCard.

It is probably not so strange that XFN (XHTML Friends Network) is becoming standard, since Matt, together with Eric Meyer and Tantek Çelik, came up with the XFN microformat in 2003 as a simple way to describe relationships between blogs.

If you run WordPress, you will see under “Blogroll” that you can use XFN when you add a link to your blogroll. There are various options for describing the blog you link to, for example whether the linked blog belongs to a “friend, acquaintance, family member, colleague, yourself, etc.”

The WordPress XFN feature