Explanations
references to personal experiences and emotional honesty
Negative Logits
Presumably    -0.57
doubtless     -0.54
presumably    -0.53
...>          -0.52
nearly        -0.50
Szer          -0.49
forse         -0.49
presumably    -0.49
(blank)       -0.49
wry           -0.48
Positive Logits
🏾                0.92
smh              0.80
niggas           0.78
Normdatei        0.77
AlterField       0.77
nigga            0.76
ain              0.75
adaptiveStyles   0.71
dope             0.68
tryna            0.68
Activations Density 0.226%
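
To make the density figure concrete, here is a minimal sketch. It assumes "activation density" means the fraction of token positions on which the feature has a nonzero activation; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to reproduce a value of 0.226%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: density is counted over token positions, and a position is
    "active" when its activation is strictly greater than the threshold.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 226 of which are active.
sample = [0.0] * 99_774 + [1.3] * 226
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.226%
```

Under this reading, a density of 0.226% would mean the feature fires on roughly 2 of every 1,000 tokens in whatever corpus was sampled.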