INDEX
Explanations
words related to positive and negative experiences or events
statements about positive and negative experiences or outcomes
Negative Logits
EVA         -0.73
Footnote    -0.72
ب           -0.71
Britain     -0.70
Pub         -0.66
ت           -0.66
Portuguese  -0.65
Geneva      -0.64
Ùĩ          -0.63
ART         -0.63
Positive Logits
ourced      1.03
aic         1.02
ided        1.01
ides        0.98
cale        0.96
undown      0.95
sembly      0.95
icum        0.93
ourcing     0.93
ugar        0.92
Activations Density 0.023%