Explanations
phrases indicating frequency or estimation of events or conditions
Negative Logits
Token      Logit
lington    -0.15
rade       -0.15
ombat      -0.15
onde       -0.14
appName    -0.14
ombre      -0.14
วย         -0.14
retim      -0.14
doma       -0.14
anches     -0.13
Positive Logits
Token       Logit
considered  0.28
thought     0.26
regarded    0.26
described   0.23
treated     0.23
regard      0.22
viewed      0.22
referred    0.21
said        0.20
consider    0.19
Activations Density 0.189%