Explanations
phrases related to expressing opinions or adding commentary in a discussion
phrases indicating statements or speech
Negative Logits
Tud        -0.88
Titanic    -0.85
idine      -0.84
Tunis      -0.83
adan       -0.81
isons      -0.80
istani     -0.80
aban       -0.80
adi        -0.80
pak        -0.77
Positive Logits
Gr         2.35
Gr         2.29
gr         2.01
gr         1.96
GR         1.85
Griff      1.69
Griffin    1.65
GR         1.64
Gro        1.56
Gro        1.52
Activations Density: 0.339%