simba
insert, extraction and analysis framework for LDM

Notice 1:
The Scala version must be compatible with both the system and Spark:
Spark 1.3.1
Scala 2.10.4
Hadoop 1.2.1
Titan 1.0.0
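For orientation, a minimal build.sbt sketch that pins these versions might look as follows; the project name is a placeholder and spark-core is referenced by its standard Maven coordinates:

// build.sbt -- minimal sketch pinning the versions above (project name is a placeholder)
name := "simba"

scalaVersion := "2.10.4"   // the Scala line Spark 1.3.1 is built for

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.1"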
Notice 2:
It is assumed that the lib directory in the simba home contains the following jars:
hadoop-client-1.2.1.jar
hadoop-core-1.2.1.jar
hadoop-gremlin-3.0.1-incubating.jar
hbase-client-0.98.2-hadoop1.jar
hbase-common-0.98.2-hadoop1.jar
hbase-protocol-0.98.2-hadoop1.jar
htrace-core-2.04.jar
Otherwise, include these libraries by modifying build.sbt, as sketched below.
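By default sbt also picks up any jars placed in lib/ as unmanaged dependencies, so dropping the files above into lib/ needs no extra configuration. If managed dependencies are preferred instead, a hedged sketch of the equivalent build.sbt entries, using the usual Maven coordinates for these artifacts, is:

// build.sbt -- hedged managed equivalents of the jars listed above;
// verify the coordinates and versions against your cluster before relying on them
libraryDependencies ++= Seq(
  "org.apache.hadoop"    % "hadoop-client"  % "1.2.1",
  "org.apache.hadoop"    % "hadoop-core"    % "1.2.1",
  "org.apache.tinkerpop" % "hadoop-gremlin" % "3.0.1-incubating",
  "org.apache.hbase"     % "hbase-client"   % "0.98.2-hadoop1",
  "org.apache.hbase"     % "hbase-common"   % "0.98.2-hadoop1",
  "org.apache.hbase"     % "hbase-protocol" % "0.98.2-hadoop1",
  "org.cloudera.htrace"  % "htrace-core"    % "2.04"
)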
Notice 3: (for Titan)
conf contains the "conf/titan-hbase-es-simba.properties" configuration file for TitanDB (HBase + Elasticsearch by default).
test_input contains the docs and links data, which can be loaded as:
val docRDD = sc.objectFile[Document](...)
val linkRDD = sc.objectFile[DocumentLink](...)
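As background, these files follow Spark's standard saveAsObjectFile / objectFile round trip. A small self-contained sketch of the mechanism (the tmp/ints path and the Int payload are placeholders; the real data uses Document and DocumentLink elements instead):

// Hedged sketch of Spark's object-file round trip, with a placeholder payload.
val nums = sc.parallelize(1 to 10)
nums.saveAsObjectFile("tmp/ints")            // write: serialized records under tmp/ints
val back = sc.objectFile[Int]("tmp/ints")    // read: the element type goes in the type parameter
println(back.count())                        // prints 10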
compile
sbt clean compile
run
sbt run
test
sbt test
Simple Example:
var gDB = TitanSimbaDB(sc, titanConf)
val docRDD = sc.objectFile[Document](...)
gDB.insert(docRDD)
gDB.docs().foreach(s => s.simbaPrint())
gDB.close()
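For context, here is an end-to-end sketch of the same example with the surrounding setup spelled out. The SparkConf settings, the input path, and the idea that titanConf is simply the path of the properties file in conf/ are all assumptions, not something this README states; simba's own import statements are omitted because its package names are not shown here.

import org.apache.spark.{SparkConf, SparkContext}

object SimbaExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("simba-example")    // assumed app name
    val sc   = new SparkContext(conf)

    // Assumed: the Titan configuration is passed as the path of the shipped properties file.
    val titanConf = "conf/titan-hbase-es-simba.properties"

    val gDB = TitanSimbaDB(sc, titanConf)                     // open the Titan-backed store
    val docRDD = sc.objectFile[Document]("test_input/docs")   // placeholder path under test_input
    gDB.insert(docRDD)                                        // insert the documents into the graph
    gDB.docs().foreach(s => s.simbaPrint())                   // print every stored document
    gDB.close()

    sc.stop()
  }
}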