NQuad parsing using Jython
2011-02-21
If you need to parse NQuad RDF files (e.g., the Billion Triples Challenge data files), Java folks can use the NxParser by Aidan Hogan and Andreas Harth: NxParser - Parser for NTriples, NQuads, and more.
You can also use it from Python, provided you run your code on the Jython implementation.
Code:
import sys
# Put the NxParser jar on the path so Jython can import its Java classes.
sys.path.append("./nxparser.jar")

from org.semanticweb.yars.nx.parser import *
from java.io import FileInputStream
from java.util.zip import GZIPInputStream

def all_triples(fname, use_gzip=False):
    # Open the file as a Java stream, decompressing on the fly if requested.
    in_file = FileInputStream(fname)
    if use_gzip:
        in_file = GZIPInputStream(in_file)

    nxp = NxParser(in_file, False)

    # Yield each parsed record as a list of N3-serialized node strings.
    while nxp.hasNext():
        triple = nxp.next()
        n3 = [i.toN3() for i in triple]
        yield n3
The code above defines a generator function that yields a stream of NQuad records, each one a list of N3-serialized strings. We can now add some demo code to see it in action:
Code:
def main():
    gzfname = "sioc-btc-2009.gz"

    for line in all_triples(gzfname, use_gzip=True):
        print line

if __name__ == "__main__":
    main()
Output:
[u'<http://2008.blogtalk.net/node/29>', u'<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>', u'<http://rdfs.org/sioc/ns#Post>', u'<http://2008.blogtalk.net/sioc/node/29>']

[u'<http://2008.blogtalk.net/node/65>', u'<http://rdfs.org/sioc/ns#content>', u'"We\'ve created a map showing the main places of interest (event locations, restaurants, pubs, shopping locations and tourist sights) during BlogTalk 2008. The conference venue is shown on the left-hand side of the map. We will also have a hardcopy for all attendees. View Larger Map"', u'<http://2008.blogtalk.net/sioc/node/65>']
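Since each record comes back as a plain list of strings, the stream can be filtered lazily with another generator. The following is only a rough sketch: `filter_by_predicate` is a hypothetical helper of mine, not part of NxParser, and it assumes every record has the shape [subject, predicate, object, context] seen in the output above. The `sample` list is made-up stand-in data, so the snippet runs without the jar.

```python
# Hypothetical helper (not part of NxParser): lazily keep only the records
# whose predicate, assumed to be the second element, matches.
def filter_by_predicate(records, predicate):
    for record in records:
        # Skip anything too short to carry a predicate.
        if len(record) >= 2 and record[1] == predicate:
            yield record

SIOC_CONTENT = u'<http://rdfs.org/sioc/ns#content>'

# Stand-in records shaped like the parser output; the literal is invented.
sample = [
    [u'<http://2008.blogtalk.net/node/29>',
     u'<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>',
     u'<http://rdfs.org/sioc/ns#Post>',
     u'<http://2008.blogtalk.net/sioc/node/29>'],
    [u'<http://2008.blogtalk.net/node/65>',
     u'<http://rdfs.org/sioc/ns#content>',
     u'"example literal"',
     u'<http://2008.blogtalk.net/sioc/node/65>'],
]

# Materialize the matches; with a real file you would iterate instead.
matches = list(filter_by_predicate(sample, SIOC_CONTENT))
```

Under Jython you would pass `all_triples(gzfname, use_gzip=True)` in place of `sample`; because both functions are generators, the file is still read one record at a time.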
Notes:
- I was using this code to parse BTC data files.
- NQuad parsing was recently added to Redland.