Explanations
words related to intrusion or entry, especially forcefully or deeply
terms related to penetration or infiltration
Negative Logits
rie         -0.78
rious       -0.73
tri         -0.70
farewell    -0.69
bind        -0.66
goodbye     -0.66
rug         -0.66
piece       -0.64
rieve       -0.64
killer      -0.63
Positive Logits
penetrated     1.11
penetrate      1.08
penetration    1.02
penet          0.99
infiltrated    0.91
INTO           0.89
deeper         0.86
penetrating    0.83
into           0.83
infiltrate     0.83
Activations Density: 0.032%