
simba

An insertion, extraction, and analysis framework for LDM.

Notice 1:

The Scala version must be compatible with both the system and Spark:

  1. Spark 1.3.1
  2. Scala 2.10.4
  3. Hadoop 1.2.1
  4. Titan 1.0.0

Notice 2:

The lib directory in the simba home is assumed to contain the following libraries:

  1. hadoop-client-1.2.1.jar
  2. hadoop-gremlin-3.0.1-incubating.jar
  3. hbase-common-0.98.2-hadoop1.jar
  4. htrace-core-2.04.jar
  5. hadoop-core-1.2.1.jar
  6. hbase-client-0.98.2-hadoop1.jar
  7. hbase-protocol-0.98.2-hadoop1.jar

Otherwise, include these libraries by modifying build.sbt.
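By default, sbt treats every jar in the project's lib directory as an unmanaged dependency, which is why placing the jars there works without further configuration. If the jars are not available locally, the build.sbt route might look roughly like the sketch below. The group IDs are assumptions inferred from the jar names and are not taken from this project's build file, so check them against your repository first.

```scala
// build.sbt fragment: a sketch, not this project's actual build file.
// Group IDs are assumptions inferred from the jar names listed above;
// verify each coordinate before relying on it.
scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.hadoop"    % "hadoop-client"  % "1.2.1",
  "org.apache.hadoop"    % "hadoop-core"    % "1.2.1",
  "org.apache.hbase"     % "hbase-common"   % "0.98.2-hadoop1",
  "org.apache.hbase"     % "hbase-client"   % "0.98.2-hadoop1",
  "org.apache.hbase"     % "hbase-protocol" % "0.98.2-hadoop1",
  "org.apache.tinkerpop" % "hadoop-gremlin" % "3.0.1-incubating",
  "org.cloudera.htrace"  % "htrace-core"    % "2.04"
)
```

Using managed dependencies keeps the versions visible in one place, at the cost of needing network access to a repository that still hosts these older artifacts.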

Notice 3: (for titan)

  1. conf contains the configuration file “conf/titan-hbase-es-simba.properties” for TitanDB (HBase + Elasticsearch by default)
  2. test_input contains the docs and links data, which can be loaded as:
     val docRDD = sc.objectFile[Document](...)
     val linkRDD = sc.objectFile[DocumentLink](...)
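The test data is read through Spark's object-file API, which stores an RDD as serialized Java objects. The following is a minimal sketch of that round trip, not simba's own code: `Doc` is a hypothetical stand-in for simba's document type (whose definition is not shown here), and the output path is a temporary directory chosen for illustration.

```scala
import java.nio.file.Files
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical stand-in for simba's document type; the real class is not shown in this README.
case class Doc(id: Long, text: String)

object ObjectFileRoundTrip {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("object-file-round-trip").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // saveAsObjectFile requires an output path that does not exist yet,
      // so resolve a fresh child under a new temporary directory.
      val out = Files.createTempDirectory("simba-sketch").resolve("docs").toString

      val docs = Seq(Doc(1L, "first"), Doc(2L, "second"))
      sc.parallelize(docs).saveAsObjectFile(out)

      // objectFile cannot infer the element type, so it is given explicitly.
      val docRDD = sc.objectFile[Doc](out)
      println(docRDD.count())
    } finally {
      // Always release the local Spark context, even if the job fails.
      sc.stop()
    }
  }
}
```

The explicit type parameter matters: reading the file back with a different element type compiles but fails at runtime when the records are used.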

compile

sbt clean compile

run

sbt run

test

sbt test

Simple Example:

val gDB = TitanSimbaDB(sc, titanConf)
val docRDD = sc.objectFile[Document](...)
gDB.insert(docRDD)
gDB.docs().foreach(s => s.simbaPrint())
gDB.close()
