INDEX
Explanations
No significant activations; the feature does not appear to detect any particular pattern or information in the given document.
Negative Logits
<eos>   -0.69
“       -0.63
that    -0.61
new     -0.60
and     -0.60
this    -0.58
make    -0.57
big     -0.56
full    -0.56
tõ      -0.55
Positive Logits
AnimationsModule   0.75
EndProject         0.67
Himo               0.66
</caption>         0.65
piram              0.65
InstrumentedTest   0.64
disambiguazione    0.64
IsMutable          0.63
SourceChecksum     0.62
Chwiliwch          0.62
Activations Density 0.170%