INDEX
Negative Logits
\\        -0.67
ATB       -0.58
costa     -0.57
Rana      -0.56
Garvey    -0.56
IALES     -0.53
Picchu    -0.53
den       -0.52
nar       -0.52
andez     -0.52
Positive Logits
ſeveral   0.99
avoient   0.97
whoſe     0.96
étoient   0.94
purpoſe   0.93
Efq       0.93
uſe       0.92
étoit     0.92
cauſe     0.91
quæ       0.89
Activations Density: 0.005%