INDEX
NEGATIVE LOGITS
purpoſe   -0.71
pleaſure  -0.68
ſelves    -0.67
cauſe     -0.65
İY        -0.63
ſta       -0.61
itſelf    -0.60
fubject   -0.60
gani      -0.60
ſtate     -0.60
POSITIVE LOGITS
like      1.50
such      1.17
вроде     0.91
such      0.89
seperti   0.84
như       0.83
مثل       0.79
zoals     0.78
如        0.77
like      0.77
Activations Density 0.001%