Explanations
references to individuals or entities in political or bureaucratic contexts
Negative Logits
sice      -0.18
elin      -0.16
Fixture   -0.15
ково      -0.15
léd       -0.14
oret      -0.14
EIF       -0.13
によ      -0.13
ICI       -0.13
_vs       -0.13
Positive Logits
nevertheless   0.26
nonetheless    0.25
Nevertheless   0.19
yet            0.18
却             0.18
卻             0.17
toch           0.17
seems          0.17
Nevertheless   0.16
yet            0.16
Activations Density 0.417%