INDEX
Explanations
events or actions that occur prior to a significant moment or change
Negative Logits
GTCX          -0.40
eningrad      -0.40
Савезне       -0.39
analog        -0.37
nền           -0.36
оригіналу     -0.36
Analog        -0.34
ffilmiau      -0.34
Související   -0.34
throughout    -0.34
Positive Logits
trane         0.50
kaarangay     0.49
voordat       0.47
bevor         0.47
before        0.47
gridx         0.45
Before        0.44
Before        0.44
IsContent     0.42
rably         0.42
Activations Density 0.432%