Explanations
superlative and frequency indicators
Negative Logits
Very        0.46
更加        0.45
More        0.43
alen        0.42
更          0.42
是你        0.41
很是        0.41
Ни          0.40
아가        0.39
Details     0.38
Positive Logits
important   0.66
likely      0.64
prevalent   0.63
healthiest  0.61
frequent    0.59
probable    0.57
likely      0.56
commonly    0.55
important   0.55
strongest   0.54
Activations Density: 0.023%