Skip to content

Commit

Permalink
ARM.Identify: wrap block classifiers to maintain PSTATE_T
Browse files Browse the repository at this point in the history
This changes the final result of all block classifiers for
ARM to ensure two properties:
  * The value of PSTATE_T is always considered unchanged
  within the same function. i.e. the value of PSTATE_T
  is always restored to its initial value for intra-block
  jumps. This ensures when calling a thumb function from
  an ARM function, the return point is decoded in ARM mode.

  * The low bit of a function address always determines
  whether or not it is ARM vs. Thumb mode, regardless of
  how it is discovered. i.e. thumb functions discovered
  from thumb mode still have their low bit set in the
  resulting discovery state
  • Loading branch information
danmatichuk committed Sep 1, 2023
1 parent 0686e5d commit 7de0b68
Show file tree
Hide file tree
Showing 2 changed files with 71 additions and 3 deletions.
4 changes: 2 additions & 2 deletions macaw-aarch32/src/Data/Macaw/ARM.hs
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ import qualified Data.ElfEdit as EE
import Data.Macaw.ARM.Arch
import Data.Macaw.ARM.Disassemble ( disassembleFn )
import Data.Macaw.ARM.Eval
import Data.Macaw.ARM.Identify ( identifyCall, identifyReturn, isReturnValue, conditionalReturnClassifier, conditionalCallClassifier )
import Data.Macaw.ARM.Identify ( identifyCall, identifyReturn, isReturnValue, conditionalReturnClassifier, conditionalCallClassifier, wrapClassifierForPstateT )
import qualified Data.Macaw.ARM.ARMReg as ARMReg
import qualified Data.Macaw.ARM.Semantics.ARMSemantics as ARMSem
import qualified Data.Macaw.ARM.Semantics.ThumbSemantics as ThumbSem
Expand Down Expand Up @@ -55,7 +55,7 @@ arm_linux_info =
, MI.rewriteArchTermStmt = rewriteTermStmt
, MI.archDemandContext = archDemandContext
, MI.postArchTermStmtAbsState = postARMTermStmtAbsState preserveRegAcrossSyscall
, MI.archClassifier = conditionalCallClassifier <|> conditionalReturnClassifier <|> MD.defaultClassifier
, MI.archClassifier = wrapClassifierForPstateT (conditionalCallClassifier <|> conditionalReturnClassifier <|> MD.defaultClassifier)
}

archDemandContext :: MDS.DemandContext ARM.AArch32
Expand Down
70 changes: 69 additions & 1 deletion macaw-aarch32/src/Data/Macaw/ARM/Identify.hs
Original file line number Diff line number Diff line change
Expand Up @@ -4,17 +4,19 @@
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE ViewPatterns #-}

module Data.Macaw.ARM.Identify
( identifyCall
, identifyReturn
, isReturnValue
, conditionalReturnClassifier
, conditionalCallClassifier
, wrapClassifierForPstateT
) where

import Control.Applicative ( (<|>) )
import Control.Lens ( (^.) )
import Control.Lens ( (^.), (%~), (&) )
import Control.Monad ( when )
import qualified Control.Monad.Reader as CMR
import qualified Data.Foldable as F
Expand All @@ -36,6 +38,9 @@ import qualified Data.Sequence as Seq
import qualified SemMC.Architecture.AArch32 as ARM

import Prelude
import qualified Language.ASL.Globals as ASL
import qualified Data.Set as Set
import Data.Macaw.AbsDomain.AbsState

-- | Test if an address is in an executable segment
isExecutableSegOff :: MC.MemSegmentOff w -> Bool
Expand Down Expand Up @@ -359,3 +364,66 @@ conditionalCallClassifier = do
Jmp.TrueFeasibleBranch _ -> fail "Infeasible false branch"
Jmp.FalseFeasibleBranch _ -> fail "Infeasible true branch"
Jmp.InfeasibleBranch -> fail "Branch targets are both infeasible"

-- | Set the least significant bit of a segment offset.
setSegOffLeastBit :: MM.MemWidth w
=> MC.MemSegmentOff w
-> MC.MemSegmentOff w
setSegOffLeastBit addr = case MM.incSegmentOff (MM.clearSegmentOffLeastBit addr) 1 of
Just addr' -> addr'
Nothing -> error "PANIC: setSegOffLeastBit"

{- |
Wrap an existing 'MAI.BlockClassifier' to ensure that the value of the
PSTATE_T register is consistent, so that ARM/Thumb mode switching is
correctly followed. Specifically two changes to the resulting 'ParsedContents'
are made:
* Override the post-state of any intra-jump targets to always restore
PSTATE_T to what it was at the *start* of the block, to ensure that all blocks within
the same function use the same decoder.
This is necessary because BLX sets PSTATE_T to 1 before jumping. The current
register-preservation logic will therefore incorrectly consider the value of PSTATE_T to always be 1
at the return point after a call to a thumb-mode function.
Instead, we overwrite the value of PSTATE_T to the value it had at the beginning of
the block. Since the only instructions that change PSTATE_T are block terminators,
we can assume that the value of PSTATE_T will be unchanged during a single block.
* Modify any new function addresses to ensure that the low bit agrees with the final
value of PSTATE_T. This is to avoid macaw producing duplicate blocks for functions
which are called from both ARM and Thumb mode.
When in thumb mode, calling a thumb-mode function does not require having the low
bit of its address set, since the value of PSTATE_T tells the processor to continue
decoding in Thumb mode. If this function is called from both thumb and ARM mode, then
macaw will produce two duplicate blocks for it, one where the low bit is set and
one where it is not.
Instead, by forcing the function address to always agree with the value of PSTATE_T, we
ensure that the low bit of any function address necessarily determines whether or not
it is thumb vs. arm mode, regardless of how it is called.
-}
wrapClassifierForPstateT ::
MAI.BlockClassifier ARM.AArch32 ids -> MAI.BlockClassifier ARM.AArch32 ids
wrapClassifierForPstateT f = do
bcc <- CMR.ask
let initRegs = MAI.classifierInitRegState bcc
let pstateT_reg = AR.ARMGlobalBV (ASL.knownGlobalRef @"PSTATE_T")
parsedContents <- f
-- restore PSTATE_T to its value at the start of the block
parsedContents' <- case initRegs ^. MC.boundValue pstateT_reg of
MC.BVValue _ init_pstateT -> do
let setPstateT (ret,absSt,bounds) = let
abs_pstateT = MA.FinSet (Set.singleton init_pstateT)
absSt' = absSt & MA.absRegState . MC.boundValue pstateT_reg %~ (\_ -> abs_pstateT)
in (ret, absSt',bounds)
let tgts' = map setPstateT (Parsed.intraJumpTargets parsedContents)
return $ parsedContents {Parsed.intraJumpTargets = tgts' }
_ -> return parsedContents
let final_pstateT = MAI.classifierFinalRegState bcc ^. MC.boundValue pstateT_reg
-- set the low bit of any discovered functions to agree with the final value of PSTATE_T
case transferValue (MAI.classifierAbsState bcc) final_pstateT of
MA.FinSet (Set.toList -> [b]) -> do
let setBit addr = if b == 1 then setSegOffLeastBit addr else MM.clearSegmentOffLeastBit addr
let newFunctionAddrs' = map setBit (Parsed.newFunctionAddrs parsedContents')
return $ parsedContents' {Parsed.newFunctionAddrs = newFunctionAddrs' }
_ -> return parsedContents'

0 comments on commit 7de0b68

Please sign in to comment.