Explanations
phrases related to actions or behaviors attributed to specific individuals
phrases that indicate involvement or responsibility
Negative Logits
berus           -0.67
plum            -0.65
grandchildren   -0.64
millenn         -0.63
kefeller        -0.62
ricane          -0.62
pods            -0.62
basil           -0.59
Reincarn        -0.59
broom           -0.59
Positive Logits
offs                 0.74
uary                 0.72
allo                 0.66
..................   0.66
guiActiveUn          0.64
displayText          0.62
aque                 0.62
aking                0.61
lab                  0.60
meal                 0.60
Activations Density: 0.014%