INDEX
Explanations
references to the original version of something
occurrences of the word "original."
Negative Logits
··          -0.65
ebin        -0.65
doi         -0.64
leground    -0.64
trap        -0.63
watching    -0.62
lies        -0.62
heid        -0.62
zone        -0.62
ries        -0.61
Positive Logits
original    3.47
originals   2.71
original    2.51
Original    2.23
Original    2.21
ORIG        1.67
originally  1.55
initial     1.48
previous    1.27
actual      1.21
Activations Density: 0.020%