majidasgari/JHazm

 
 

JHazm

A Java version of Hazm (Python library for digesting Persian text)

  • Text cleaning
  • Sentence and word tokenizer
  • Word lemmatizer
  • POS tagger
  • Dependency parser
  • Corpus readers for Hamshahri and Bijankhan

Dependencies

Building and installing this module requires Maven.

Installation

Using as Maven Dependency

Add this dependency to your pom.xml:

<dependency>
    <groupId>ir.ac.iust.nlp</groupId>
    <artifactId>jhazm</artifactId>
    <version>1.0.0</version>
</dependency>

Note: If the artifact is not available in Maven Central, you can use JitPack:

  1. Add the JitPack repository to your pom.xml:
<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>
  2. Add the dependency:
<dependency>
    <groupId>com.github.majidasgari</groupId>
    <artifactId>JHazm</artifactId>
    <version>master-SNAPSHOT</version>
</dependency>

Building from Source

To build the project and install it into your local Maven repository, so that other Maven projects can use it as a library, run:

mvn clean install

Running the JAR

To build a single JAR file that includes all dependencies, run:

mvn clean compile assembly:single

To run the JAR and print the help message:

java -jar target/jhazm-jar-with-dependencies.jar

For example, to run POS tagging on the bundled sample file:

java -jar target/jhazm-jar-with-dependencies.jar -a partOfSpeechTagging -o test.txt

To run it on another input file:

java -jar target/jhazm-jar-with-dependencies.jar -a partOfSpeechTagging -o test.txt -i input.txt

Or on a piece of text passed directly on the command line:

java -jar target/jhazm-jar-with-dependencies.jar -a partOfSpeechTagging -o test.txt -t "سلام من خوب هستم!"
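The same commands can also be launched from another Java program. The following is a minimal sketch, not part of JHazm itself: the class name JHazmCli and its helper methods are hypothetical, and it uses only the JAR path and the -a, -o and -t options shown above. It assumes the JAR has already been built and that the program is started from the project root.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of JHazm): wraps the command lines shown above.
public class JHazmCli {
    // JAR path as used in the commands above.
    static final String JAR = "target/jhazm-jar-with-dependencies.jar";

    /**
     * Builds the argument list for one run.
     * inputText may be null, in which case no -t option is passed
     * and the tool falls back to its bundled sample file.
     */
    static List<String> buildCommand(String action, String outputFile, String inputText) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-jar");
        cmd.add(JAR);
        cmd.add("-a");
        cmd.add(action);
        cmd.add("-o");
        cmd.add(outputFile);
        if (inputText != null) {
            cmd.add("-t");
            cmd.add(inputText);
        }
        return cmd;
    }

    /** Starts the process, forwards its output to this console, and returns its exit code. */
    static int run(List<String> cmd) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Optional first argument: the text to tag.
        String text = args.length > 0 ? args[0] : null;
        int exitCode = run(buildCommand("partOfSpeechTagging", "test.txt", text));
        System.out.println("exit code: " + exitCode);
    }
}
```

Passing the arguments as a list avoids shell quoting problems with Persian text that contains spaces or punctuation.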

Requirements

  • Java 21 or higher
  • Maven 3.6+

Good Luck!
