INDEX
Explanations
phrases related to actions or tasks being done or considered
verbs that denote actions related to discussion and processing of topics
Negative Logits
Defenders         -0.66
utra              -0.62
urai              -0.61
Balance           -0.60
acion             -0.59
communications    -0.59
ultz              -0.58
asio              -0.58
adan              -0.57
riot              -0.57
Positive Logits
own               0.93
ĸļ                0.83
oing              0.80
nesday            0.77
gling             0.75
uled              0.74
ired              0.73
UCHIJ             0.72
escription        0.71
Alive             0.70
Activations Density: 0.264%