Explanations
the word "dd" followed by a number, likely indicating a specific identifier or code within the text
occurrences of the abbreviation "dd" or similar patterns
Negative Logits
awaru       -0.75
atis        -0.74
helle       -0.68
Stras       -0.68
urity       -0.67
framework   -0.65
membr       -0.62
spoiler     -0.61
addons      -0.59
auga        -0.58
Positive Logits
ragon       1.08
itional     1.08
ouble       1.07
iamond      1.06
ressing     1.05
orf         1.03
etermin     0.98
irect       0.97
aughter     0.95
iscover     0.91
Activations Density: 0.025%