INDEX
Explanations
numbers and abbreviations ending with 'amp'
Negative Logits
Token     Logit
SHIP      -0.97
bis       -0.89
BLIC      -0.86
ALLY      -0.79
ryu       -0.75
DOM       -0.74
fram      -0.73
OH        -0.73
sv        -0.72
âĸ¬âĸ¬    -0.72
Positive Logits
Token     Logit
hetamine  1.06
shire     1.03
stead     0.97
acus      0.95
owered    0.94
aic       0.94
ierre     0.94
odcast    0.93
agne      0.92
reys      0.90
Activations Density: 0.894%
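
The token listed as âĸ¬âĸ¬ under the negative logits looks like a raw vocabulary string from a byte-level BPE tokenizer, not a display error. The sketch below is a minimal illustration of how such a string could be mapped back to readable text, assuming the vocabulary uses the GPT-2 style byte-to-character table. That assumption is not stated on this page, and the names `byte_level_table` and `decode_token` are hypothetical helpers introduced only for this example.

```python
def byte_level_table():
    """Build a byte -> character table in the GPT-2 byte-level BPE style.

    Assumption: the tokenizer behind this page uses this scheme.
    Printable bytes map to themselves; every other byte is shifted
    to a code point at 256 or above, in increasing byte order.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    table = {}
    shifted = 0
    for b in range(256):
        if b in printable:
            table[b] = chr(b)
        else:
            table[b] = chr(256 + shifted)
            shifted += 1
    return table


# Inverse table: vocabulary character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in byte_level_table().items()}


def decode_token(token: str) -> str:
    """Turn a byte-level vocabulary string back into UTF-8 text."""
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    # A single token may hold only part of a multi-byte character,
    # so undecodable bytes are replaced instead of raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("âĸ¬âĸ¬"))
```

Under that assumption, each âĸ¬ group corresponds to the bytes E2 96 AC, which is the UTF-8 encoding of U+25AC (a black rectangle), so the entry would read as two such rectangles. Plain ASCII tokens such as SHIP pass through unchanged.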