INDEX
Explanations
phrases involving actions performed or decisions made by individuals
verbs related to actions and contributions
Negative Logits
rep         -0.60
herry       -0.59
Metatron    -0.57
Kle         -0.57
impe        -0.56
Patri       -0.55
Fool        -0.54
Kru         -0.54
Ner         -0.54
ocide       -0.53
Positive Logits
abouts      0.78
during      0.69
rentices    0.65
uggle       0.64
*/(         0.64
ãĥ¼ãĤ¯      0.64
beforehand  0.63
alian       0.62
ulus        0.61
anas        0.60
Activations Density: 0.281%