INDEX
Explanations
strings within parentheses
punctuation marks that denote the end of thoughts or statements
Negative Logits
het       -0.87
houses    -0.83
arest     -0.73
gard      -0.72
gart      -0.70
ihu       -0.68
rals      -0.66
ton       -0.65
shaw      -0.65
stones    -0.65
Positive Logits
terday    0.88
/"        0.84
"))       0.73
ONSORED   0.73
":[       0.70
"],       0.70
},{"      0.69
");       0.68
fixme     0.68
CLAIM     0.67
Activations Density 0.011%