INDEX
Explanations
references to opinions and assessments of expertise
Negative Logits
olle      -0.14
>(*       -0.14
ysters    -0.14
loh       -0.14
licts     -0.14
ETY       -0.14
orny      -0.14
ennen     -0.13
acker     -0.13
ond       -0.13
Positive Logits
opinion    0.77
opinions   0.73
opin       0.63
Opinion    0.61
views      0.48
Op         0.47
意见       0.42
thoughts   0.41
Op         0.40
Views      0.39
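Non-ASCII tokens in logit lists like these are often displayed in the printable stand-in alphabet used by byte-level BPE tokenizers, so the Chinese token 意见 ("opinion") can show up as `æĦıè§ģ`. The sketch below shows one way to recover the readable text. It assumes the model uses a GPT-2 style byte-to-character mapping, which this page does not state; the function names `byte_level_alphabet` and `decode_byte_level_token` are hypothetical helpers, not part of any library.

```python
def byte_level_alphabet():
    """Map each printable stand-in character back to the raw byte it represents.

    Assumption: the GPT-2 style byte-level BPE mapping applies to this model.
    """
    # Bytes that are already printable keep their own code point.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    char_to_byte = {chr(b): b for b in printable}
    # Every remaining byte is shifted to code points 256, 257, ... in order.
    offset = 0
    for b in range(256):
        if b not in printable:
            char_to_byte[chr(256 + offset)] = b
            offset += 1
    return char_to_byte


def decode_byte_level_token(token):
    """Turn a byte-level token string into readable UTF-8 text."""
    table = byte_level_alphabet()
    # Raises KeyError if the token contains a character outside the alphabet.
    raw = bytes(table[ch] for ch in token)
    # A single token may hold a partial UTF-8 sequence, so decode leniently.
    return raw.decode("utf-8", errors="replace")


print(decode_byte_level_token("æĦıè§ģ"))
```

Under that assumption the call prints 意见. The same mapping turns a leading `Ġ` into a space, which would explain why two rows that both read "Op" can be distinct tokens: one may carry a leading space that the display drops.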
Activations Density 0.300%