Package org.apache.tika.language.detect
Class LanguageResult
java.lang.Object
org.apache.tika.language.detect.LanguageResult
-
Field Summary
Fields -
Constructor Summary
Constructors
LanguageResult(String language, LanguageConfidence confidence, float rawScore)
LanguageResult(String language, LanguageConfidence confidence, float rawScore, float confidenceScore)
-
Method Summary
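Modifier and Type | Method | Description
LanguageConfidence | getConfidence() |
float | getConfidenceScore() | Detector-agnostic confidence score (0.0 to 1.0).
String | getLanguage() | The ISO 639-1 language code (plus optional country code)
float | getRawScore() |
boolean | isLanguage(String language) | Return true if the target language matches the detected language.
boolean | isReasonablyCertain() |
boolean | isUnknown() |
String | toString() |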
-
Field Details
-
NULL
-
-
Constructor Details
-
LanguageResult
public LanguageResult(String language, LanguageConfidence confidence, float rawScore)
Parameters:
language - ISO 639-1 language code (plus optional country code)
rawScore - confidence of detector in the result.
-
LanguageResult
public LanguageResult(String language, LanguageConfidence confidence, float rawScore, float confidenceScore)
Parameters:
language - ISO 639-1 language code (plus optional country code)
rawScore - detector-specific score (e.g., softmax probability)
confidenceScore - detector-agnostic confidence (0.0 to 1.0, higher = more confident). For comparing results across different decodings or detectors.
-
-
Method Details
-
getLanguage
public String getLanguage()
The ISO 639-1 language code (plus optional country code)
Returns:
a string representation of the language code
-
getRawScore
public float getRawScore() -
getConfidenceScore
public float getConfidenceScore()
Detector-agnostic confidence score (0.0 to 1.0). Higher values indicate the detector is more confident in the result. This can be used to compare results across different text decodings (e.g., for encoding detection) without knowing the detector implementation.
-
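As a rough illustration of comparing results across decodings, the sketch below picks the candidate charset whose decoded text received the highest confidence score. It does not use Tika itself: the Candidate class, the best method, and the scores are hypothetical stand-ins for values that getConfidenceScore() would return after running a detector on each decoding.

```java
import java.util.Arrays;
import java.util.List;

public class BestDecodingSketch {

    /** Hypothetical holder: a candidate charset and the confidence score of its decoded text. */
    public static final class Candidate {
        public final String charset;
        public final float confidenceScore; // assumed to lie in 0.0 to 1.0

        public Candidate(String charset, float confidenceScore) {
            this.charset = charset;
            this.confidenceScore = confidenceScore;
        }
    }

    /** Returns the candidate with the highest confidence score, or null for an empty list. */
    public static Candidate best(List<Candidate> candidates) {
        Candidate best = null;
        for (Candidate c : candidates) {
            if (best == null || c.confidenceScore > best.confidenceScore) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Illustrative scores only; real values would come from a language detector.
        List<Candidate> candidates = Arrays.asList(
                new Candidate("UTF-8", 0.92f),
                new Candidate("windows-1252", 0.41f),
                new Candidate("ISO-8859-5", 0.07f));
        System.out.println(best(candidates).charset); // prints UTF-8
    }
}
```

Because the score is described as detector-agnostic, a comparison like this should not need to know which detector produced each value.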
getConfidence
-
isReasonablyCertain
public boolean isReasonablyCertain() -
isUnknown
public boolean isUnknown() -
isLanguage
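public boolean isLanguage(String language)
Return true if the target language matches the detected language. The two codes are compared at the coarser of the two precisions (requested or detected), ignoring case. This means:
target | detected | match?
zh | en | false
zh | zh | true
zh | zh-CN | true
zh-CN | zh | true
zh-CN | zh-TW | false
zh-CN | zh-cn | true (case-insensitive)
Parameters:
language - the target language code to compare with the detected language
Returns:
true if the target language matches the detected language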
-
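The matching rule in the table above can be sketched as a standalone method. This is an illustration of the documented behavior, not necessarily how Tika implements it; the class name LanguageMatchSketch and the method matches are hypothetical, and the sketch assumes the language and country parts are separated by a hyphen.

```java
public class LanguageMatchSketch {

    /**
     * Compares two language codes part by part, ignoring case, and stops at the
     * shorter of the two, so "zh" matches "zh-CN" but "zh-CN" does not match "zh-TW".
     */
    public static boolean matches(String target, String detected) {
        String[] targetParts = target.split("-");
        String[] detectedParts = detected.split("-");
        int shared = Math.min(targetParts.length, detectedParts.length);
        for (int i = 0; i < shared; i++) {
            if (!targetParts[i].equalsIgnoreCase(detectedParts[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Rows from the table: target, detected
        String[][] rows = {
            {"zh", "en"}, {"zh", "zh"}, {"zh", "zh-CN"},
            {"zh-CN", "zh"}, {"zh-CN", "zh-TW"}, {"zh-CN", "zh-cn"}
        };
        for (String[] row : rows) {
            System.out.println(row[0] + " | " + row[1] + " | " + matches(row[0], row[1]));
        }
    }
}
```

Comparing only the parts both codes share is what makes the check symmetric: a coarse target accepts any region, and a region-specific target accepts a coarse detection.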
toString
-