universitystudyingsubject-3104
Theory
Artificial Intelligence
Book: Artificial Intelligence - S. Russell, P. Norvig
Artificial Intelligence
2. Intelligent Agents
- Intelligent agent
- rational (→ "do the right thing" → "achieve the highest degree of success")
- human-like
- Agent:
- can be defined as a function f: percept history → action to perform
- perceives the environment through sensors
- acts on the environment through actuators
Task Environment
Task Environment = PEAS = description of the problem an agent solves:
- definito da:
- Performance Measure: the notion of desirability
- e.g. safe/fast/legal
- Environment
- Actuators
- Sensors
- dimensions:
- unobservable/partially observable/fully observable
- do the sensors observe the entire environment? Is an internal state needed to keep track of the external environment, or are the sensors alone sufficient?
- single agent/multi agent
- multi agent = other agents exist (an agent here is any object whose behaviour influences our performance measure)
- a multi agent environment can be competitive or cooperative
- deterministic/stochastic
- deterministic if the next state depends only on the current state + the action
- n.b. a partially observable environment may look stochastic, but that is not necessarily the case
- stochastic = probabilities associated with the possible outcomes
- uncertain = multiple outcomes with no numeric probabilities = caused by non-determinism or by the environment not being fully observable
- episodic/sequential
- Is the percept history important for choosing the next action?
- static/dynamic
- Does the environment change while the agent is thinking?
- semi-dynamic: the environment itself doesn't change, but time affects the performance score
- discrete/continuous
- depends on the number of states and on what the sensors perceive
- known/unknown
- do I know the states caused by my actions?
Types of agents
- Simple reflex
- chooses the action based on conditions (if-rules on the sensors), as in the sketch after this list
- Model-based reflex
- keeps a memory of the percept history (internal state)
- predicts the environment's evolution based on the action
- Goal-based
- has a goal
- it asks: how will the next action affect my goal?
- Utility-based
- uses a function that maps a state to a "happiness" value.
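A minimal sketch (not from the book; the vacuum-world percept format and rules are illustrative assumptions) of a simple reflex agent viewed as a function from the current percept to an action:

```python
# Minimal sketch of a simple reflex agent for a toy vacuum world.
# The percept format ("location", "status") and the rules are illustrative only.

def simple_reflex_vacuum_agent(percept):
    """Choose an action from the current percept alone (no internal state)."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"

# Example: the agent reacts to each percept independently.
print(simple_reflex_vacuum_agent(("A", "Dirty")))  # Suck
print(simple_reflex_vacuum_agent(("A", "Clean")))  # Right
```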
Learning agent
- divided into components:
- performance element: the actual agent (simple reflex, model-based, etc.)
- critic: using the percepts, it reports how well the agent is performing
- learning element: takes the critic's feedback and modifies the performance element, trying to improve its performance
- problem generator: suggests actions that will lead to new experiences
Agent components
- atomic
- each state is a black box
- factored
- each state is represented by variables or attributes
- structured
- each state contains objects and their relationships - the kind of representation that underlies a relational database
Problem-solving
3. Solving Problems by Searching
Problem components
- Initial state
- Actions: given a state, returns all the possible actions
- Transition Model: given a state and an action, returns the result state (a.k.a. successor)
- State Space: the set of all reachable states from the initial state
- Search Tree: a graph with actions as branches and states as nodes
- Goal Test: determines whether a given state is a goal state.
- Path Cost: assigns a numeric cost to each path
Search tree
- frontier: set of nodes available for graph exploration
- Tree-Search: contains redundant paths
- Graph-Search: uses an “explored set” to avoid loops.
Measuring Problem-solving performance
- Types:
- Completeness: is the algorithm guaranteed to find a solution when there is one?
- Optimality: does the strategy find the optimal (lowest cost) solution?
- Time complexity
- Space complexity
- Variables
- b (branching factor): maximum number of successors of any node
- d (depth): depth of the shallowest goal node
- m (maximum length): maximum length of any path in the graph-search tree
Uninformed Search: we only have access to the problem definition
Breadth-first search
Expands the shallowest nodes first.
- Complete
- Optimal for unit step costs
- Exponential space complexity of O(b^d) ("dominated by frontier size")
Uniform-cost search
Expands the lowest path cost node (the frontier is implemented with a priority queue).
- Optimal for any cost
- It is not necessarily better than BFS:
- it has to explore all the low-cost edges before reaching the goal.
- it must keep all the generated nodes with their costs (to extract the minimum)
Depth-first search
Expands the deepest node.
- Complete in the graph-search version (no cycles)
- Not Optimal
- Linear space complexity O(bm) (where m is the maximum depth)
- we can reduce this to O(m) by generating only one successor at a time (backtracking variant)
Depth-limited search
Expands the deepest node up to a fixed depth limit ℓ.
- same as DFS, but it stops when depth ℓ is reached
- Not Complete
- Not Optimal
- Linear Space Complexity
Iterative deepening depth-first search
Uses Depth-limited search with increasing limits ℓ = 0, 1, 2, ... until a goal is found.
- Complete
- Optimal for unit step costs (as BFS)
- Linear Space Complexity
- Time complexity comparable to BFS (O(b^d)); a runnable sketch follows below.
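A runnable sketch of iterative deepening depth-first search; the graph representation (a dict of successor lists) and the example graph are illustrative assumptions:

```python
# Sketch of iterative deepening depth-first search on an explicit graph.

def depth_limited(graph, node, goal, limit, path):
    if node == goal:
        return path
    if limit == 0:
        return None
    for succ in graph.get(node, []):
        if succ not in path:                      # avoid cycles on the current path
            result = depth_limited(graph, succ, goal, limit - 1, path + [succ])
            if result is not None:
                return result
    return None

def iterative_deepening(graph, start, goal, max_depth=50):
    for limit in range(max_depth + 1):            # limits 0, 1, 2, ... as in the notes
        result = depth_limited(graph, start, goal, limit, [start])
        if result is not None:
            return result
    return None

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D", "G"], "D": ["G"]}
print(iterative_deepening(graph, "A", "G"))       # ['A', 'C', 'G']
```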
Bidirectional Search
Starts two searches in parallel, one from the start and the other from the goal, until they meet (the frontiers touch - their intersection is no longer empty). It is efficient since b^(d/2) + b^(d/2) is much smaller than b^d.
- Complete
- Not Optimal as-is (even when using BFS from both sides - it can be fixed using a constant-time check when the frontiers meet)
- Exponential complexity of O(b^(d/2))
Informed Search
We have access to a heuristic function that estimates the path cost of a solution.
Intuition (general Best-First Search)
- until now, we used the path cost function g(n). In informed search, we introduce a heuristic function h(n): the estimated cost of the cheapest path from the state at node n to a goal state. Compared to g(n), h(n) depends only on the state of node n.
- we start from the general Tree-Search or Graph-Search algorithm and we use a new evaluation function f(n) instead. f(n) takes into account (differently depending on the search strategy) the heuristic function.
- The general implementation is the same as Uniform-Cost Search, changing the ordering function from g(n) to f(n).
Greedy Best-First Search
- Uses f(n) = h(n): it expands the node that appears closest to the goal
- Complete in the graph-search version with a finite state space
- Not Optimal
- Efficient
A* Search
- Complete
- Uses f(n) = g(n) + h(n).
- Optimal if some conditions hold:
- h must be admissible: it must never overestimate the cost to reach the goal (it must be optimistic).
- h must be consistent: the triangle inequality h(n) ≤ c(n, a, n') + h(n') must be respected.
- A* Tree-Search is optimal if h is admissible. Proof sketch: let G2 be a sub-optimal goal node on the frontier, and n a frontier node that lies on an optimal path; then f(n) < f(G2), so G2 will not be chosen before the optimal solution is found.
- A* Graph-Search is optimal if h is consistent.
- Being a best-first search, it keeps the whole frontier in memory, making it unsuitable for large state space problems (a runnable sketch follows below).
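A minimal A* graph-search sketch, assuming an explicit graph with edge costs and a given admissible, consistent heuristic (all the data below is made up for illustration):

```python
import heapq

# Sketch of A* graph search with f(n) = g(n) + h(n).

def a_star(graph, h, start, goal):
    frontier = [(h[start], 0, start, [start])]    # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for succ, cost in graph.get(node, []):
            new_g = g + cost
            if new_g < best_g.get(succ, float("inf")):
                best_g[succ] = new_g
                heapq.heappush(frontier, (new_g + h[succ], new_g, succ, path + [succ]))
    return None, float("inf")

graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 3)]}
h = {"S": 5, "A": 4, "B": 2, "G": 0}              # admissible and consistent here
print(a_star(graph, h, "S", "G"))                 # (['S', 'A', 'B', 'G'], 6)
```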
Iterative Deepening A* Search:
It's essentially iterative deepening depth-first search, using the newly defined f(n) = g(n) + h(n) as the cutoff value instead of the depth.
Recursive Best-First Search
Similar to IDA*, but it wastes less effort: plain best-first search stores every generated node in order to select the one with the lowest f-value, while RBFS keeps only the current path plus the f-value of the best alternative path available from any ancestor, backing up when the current path becomes worse than that alternative.
It uses linear space, is complete and optimal.
It can change its mind multiple times, so time complexity is hard to define.
Simplified Memory-bounded A* Search:
Aims to improve RBFS using more space if available.
It's a hybrid between IDA* and RBFS. It proceeds like A* until no more space is available to allocate a new node. Then, it drops the worst leaf node (the one with the highest f-value) and just saves its f-value in the parent, so the subtree can be regenerated later if needed.
This is a variant of Memory-bounded A* Search, which is not explained.
4. Beyond Classical Search
Hill-Climbing Search
- sometimes called greedy local search. It moves to the neighbouring state with the best value (the one that looks closest to the goal). It doesn't track nodes using a tree, it just moves between states.
- Local maxima are the main problem of this algorithm: the search can reach a peak that is not the global maximum and get stuck there
Simulated Annealing
- Aims to reduce the Local Maxima problem of Hill-Climbing Search.
- It uses a temperature variable that decreases over time (over iterations).
- The algorithm chooses the successor node at random: if it has a better value than the current one, it gets chosen; otherwise it gets accepted only with a probability that depends on the temperature, allowing worse nodes to be chosen at the beginning (see the sketch below).
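A small sketch of simulated annealing on a toy 1-D objective; the objective function, the neighbourhood, and the cooling schedule are illustrative assumptions, not from the notes:

```python
import math
import random

def objective(x):
    return -(x - 3) ** 2 + 10          # toy objective, global maximum at x = 3

def simulated_annealing(x0=0.0, t0=10.0, cooling=0.95, steps=1000):
    current, t = x0, t0
    for _ in range(steps):
        if t < 1e-6:
            break
        candidate = current + random.uniform(-1, 1)      # random successor
        delta = objective(candidate) - objective(current)
        # Better moves are always accepted; worse moves with probability e^(delta/T).
        if delta > 0 or random.random() < math.exp(delta / t):
            current = candidate
        t *= cooling                                      # temperature decreases over time
    return current

print(round(simulated_annealing(), 2))  # close to 3
```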
5. Adversarial Search
Games are well-defined problems that are generally interpreted as requiring intelligence to play well.
Introduces uncertainty, since the opponents' moves cannot be determined in advance.
Search spaces can be very large.
For chess:
- Branching factor: 35
- Depth: 50 moves each player
- Search tree: 35^100 nodes (~10^40 distinct legal positions)
Games are an instance of the general search problem.
- States where the game has ended are called terminal states.
- A utility (payoff) function determines the value of terminal states, e.g. win=+1, draw=0, lose=-1.
Types of game:
| | Deterministic | Chance |
|---|---|---|
| Perfect Information | Chess, Checkers, Go, Othello | Backgammon, Monopoly |
| Imperfect Information | Battleships, Blind Tic-Tac-Toe | Bridge, Poker, Scrabble, Nuclear War |
Minimax Algorithm
- The Minimax algorithm gives perfect play for deterministic, perfect-information games.
- In two-player games, assume one is called MAX (tries to maximize utility) and one is called MIN (tries to minimize utility).
- In the search tree, the first layer contains moves by MAX, the next layer moves by MIN, and they alternate down to the terminal states.
- Each layer in the search is called a ply.
Implementation:
- Generate complete game tree down to terminal states.
- Compute utility of each node bottom up from leaves toward root.
- At each MAX node, pick the move with maximum utility.
- At each MIN node, pick the move with minimum utility (assumes opponent always acts correctly to minimize utility).
- When the root is reached, the optimal move is determined.
It is essentially a depth-first exploration: space complexity is linear in the depth (O(bm)), while time complexity is O(b^m).
function Minimax(state):
  if TerminalTest(state) then return Utility(state)
  if state is a MAX node then return max over a in Actions(state) of Minimax(Result(state, a))
  if state is a MIN node then return min over a in Actions(state) of Minimax(Result(state, a))
Alpha-Beta Pruning
The problem with minimax search is that the number of game states it has to examine is exponential in the depth of the tree.
Look at the intuition:
Minimax(root) = max(min(3, 12, 8), min(2, x, y), min(14, 5, 2))
= max(3, min(2, x, y), 2)
= max(3, z, 2) where z = min(2, x, y) ≤ 2
= 3
i.e., we don’t need to know the values of x and y!
The algorithm:
function AlphaBetaSearch(state):
v := MaxValue(state, −∞, +∞)
return the action in Actions(state) that has value v
function MaxValue(state, α, β):
if TerminalTest(state) then return Utility(state)
v := −∞
for each action in Actions(state):
v := max(v, MinValue(Result(state, action), α, β))
if v ≥ β then return v
α := max(α, v)
return v
function MinValue(state, α, β):
same as MaxValue but reverse the roles of α/β and min/max and −∞/+∞
With perfect move ordering, the time complexity reduces from O(b^m) to O(b^(m/2)).
These two algorithms are complete and optimal (if the opponent makes optimal decisions), but they aren't suitable for real games where there are too many states. A runnable sketch of minimax with alpha-beta pruning follows.
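A runnable sketch of minimax with alpha-beta pruning on an explicit game tree (leaves are utilities for MAX); the tree mirrors the intuition example above, with x and y set to arbitrary values:

```python
# Sketch of minimax with alpha-beta pruning on an explicit game tree.

def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    if isinstance(node, (int, float)):          # terminal state: return its utility
        return node
    if maximizing:
        v = float("-inf")
        for child in node:
            v = max(v, alphabeta(child, False, alpha, beta))
            if v >= beta:                       # MIN will never allow this branch
                return v
            alpha = max(alpha, v)
        return v
    v = float("inf")
    for child in node:
        v = min(v, alphabeta(child, True, alpha, beta))
        if v <= alpha:                          # MAX will never allow this branch
            return v
        beta = min(beta, v)
    return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]      # root is a MAX node over three MIN nodes
print(alphabeta(tree, True))                    # 3 — x and y (here 4 and 6) are never examined
```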
Imperfect Algorithms
The previous algorithms still have to search all the way to terminal states for at least a portion of the search space. Searching to that depth is usually not practical. In this chapter we apply a heuristic evaluation function to states in the search, effectively turning nonterminal nodes into terminal leaves.
H-Minimax Algorithm
It’s like Minimax, but there are two replacements:
- the TerminalTest function is replaced by a CutOffTest function, which also has access to the current node's depth.
- the Utility function is replaced by an Eval function, a heuristic estimate of the utility of the position.
Evaluation and Utility Functions
Weighted Linear Evaluation Functions
Eval(s) = w1·f1(s) + w2·f2(s) + ... + wn·fn(s), where the fi are features of the position and the wi their weights. This assumes all the features are independent of each other, which is usually not true.
We want more sophisticated cutoff tests.
Quiescent positions
- We only cut off the search in quiescent positions.
- i.e. in positions that are "stable", unlikely to exhibit wild swings in value
- non-quiescent positions should be expanded further
Horizon effect
another problem - the horizon effect:
- if a bad position is unavoidable (e.g., loss of a piece in chess), but the system can delay it from happening, it might push the bad position “over the horizon”
- in the end, the resulting delayed position might be even worse
Monte Carlo Tree Search
Algorithms in non-deterministic games
6. Constraint Satisfaction Problem
Until now we assumed all the states were black boxes.
Now states are factored, so if we’re able to formulate a problem as CSP we can get more efficient solutions.
A constraint satisfaction problem consists of:
- X: a set of variables, {X1, ..., Xn}.
- D: a set of domains, {D1, ..., Dn}, one for each variable. Di is the set of allowable values for Xi.
- C: a set of constraints that specify allowable combinations of values. Each constraint Ci is a pair <scope, rel>:
- scope: the tuple of variables that participate in the constraint
- rel: the relation, i.e. the set of values (each value is a tuple of elements) that the variables in scope can take on.
In a given state, we can set values for one or more variables. These are called assignments.
- An assignment is consistent if it doesn’t violate any constraint.
- An assignment is complete if it’s consistent and every variable is assigned.
- An assignment is partial if it's consistent but only some of the variables are assigned.
We can represent a CSP with a constraint graph:
- Nodes are the variables of the problem.
- An edge between two nodes exists if there is at least one constraint whose scope includes both variables.
N-Queens example
Implementation with Search
We can solve any CSP using Search. The entities are defined as follows:
- Initial state: all the variables are unassigned.
- Actions: assign a value to a variable
- Goal function: the variables assignment is complete.
In each state, we store:
- The unassigned list: all the variables not yet assigned.
- The assigned list: all the variables already assigned.
- Additionally, for every variable:
- its name: not really useful for the algorithm itself
- its domain
- its current value, if present.
Constraints could be represented as sets or as functions that check for validity.
We can use DFS as the search algorithm: each level assigns one variable, so the search tree has depth n (the number of variables).
Constraints could be redundant. We can apply a series of reductions to simplify them.
Consistency
These can be used to reduce the number of legal values, and could potentially find the solution directly.
- Node Consistency:
- we consider the unary constraints.
- this is straightforward: we iterate all the variables and exclude the values in the domain that don’t match the unary constraints.
- Arc Consistency:
- we consider the binary constraints
- we say Xi is arc-consistent with respect to Xj if, for every value in the domain of Xi, there is some value in the domain of Xj that satisfies the binary constraint on the arc (Xi, Xj)
- A real implementation algorithm is called AC-3 (a sketch of it follows below)
- Path Consistency
- we consider the ternary constraints
- "A two-variable set {Xi, Xj} is path-consistent with respect to a third variable Xm if, for every assignment consistent with the constraints on {Xi, Xj}, there is an assignment to Xm that satisfies the constraints on {Xi, Xm} and {Xm, Xj}"
- A real implementation algorithm is called PC-2
- K-Consistency
- it's the generalisation of the previous reductions: 1-consistency → Node, 2-consistency → Arc, 3-consistency → Path
If we impose strong n-consistency (for a problem with n variables), we can solve the CSP without any search, but enforcing it takes exponential time in the worst case.
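A minimal sketch of the AC-3 algorithm mentioned above; the domain/constraint representation and the tiny X < Y < Z problem are illustrative assumptions:

```python
from collections import deque

def revise(domains, constraints, xi, xj):
    """Remove values of Xi that have no supporting value in Xj."""
    pred = constraints[(xi, xj)]
    removed = False
    for x in set(domains[xi]):
        if not any(pred(x, y) for y in domains[xj]):
            domains[xi].discard(x)               # no value of Xj supports x
            removed = True
    return removed

def ac3(domains, constraints):
    """constraints: dict mapping (Xi, Xj) -> predicate(x, y) for binary constraints."""
    queue = deque(constraints.keys())
    while queue:
        xi, xj = queue.popleft()
        if revise(domains, constraints, xi, xj):
            if not domains[xi]:
                return False                     # a domain became empty: inconsistent
            for (xk, xl) in constraints:
                if xl == xi and xk != xj:
                    queue.append((xk, xl))       # re-check arcs pointing at Xi
    return True

domains = {"X": {1, 2, 3}, "Y": {1, 2, 3}, "Z": {1, 2, 3}}
constraints = {("X", "Y"): lambda x, y: x < y, ("Y", "X"): lambda y, x: y > x,
               ("Y", "Z"): lambda y, z: y < z, ("Z", "Y"): lambda z, y: z > y}
ac3(domains, constraints)
print(domains)   # X: {1}, Y: {2}, Z: {3}
```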
Backtracking Search
We use DFS with a fixed variable-assignment order: variable assignment is commutative, so fixing the order avoids exploring the same combination of assignments in different orders and keeps the algorithm faster.
The algorithm is recursive and does the following: at each level it picks one unassigned variable and iterates over all the values in its domain (a minimal sketch follows this list):
- It returns failure if the state is inconsistent (to let the caller iterate the other values)
- It continues until the values are finished (there’s no solution) or the assignment is complete (the problem has been solved).
- For each variable it imposes a certain level of consistency, depending on the problem ^a6a0fe
- note that we do this before performing the search.
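A minimal backtracking-search sketch (without inference or ordering heuristics), on a tiny map-colouring instance chosen purely for illustration:

```python
# Sketch of backtracking search for a CSP: colour three mutually adjacent regions.

def backtrack(assignment, variables, domains, neighbors):
    if len(assignment) == len(variables):
        return assignment                       # complete and consistent assignment
    var = next(v for v in variables if v not in assignment)   # fixed order
    for value in domains[var]:
        consistent = all(assignment.get(n) != value for n in neighbors[var])
        if consistent:
            assignment[var] = value
            result = backtrack(assignment, variables, domains, neighbors)
            if result is not None:
                return result
            del assignment[var]                 # undo and try the next value
    return None                                 # failure: caller tries another value

variables = ["WA", "NT", "SA"]
domains = {v: ["red", "green", "blue"] for v in variables}
neighbors = {"WA": ["NT", "SA"], "NT": ["WA", "SA"], "SA": ["WA", "NT"]}
print(backtrack({}, variables, domains, neighbors))
# {'WA': 'red', 'NT': 'green', 'SA': 'blue'}
```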
If we just need to find one solution, we can apply some ordering tweaks to improve performance.
Of course, if we need all the solutions, ordering doesn’t matter: we have to explore the whole graph.
Variable Order
How do we order the iterated variables?
- Minimum Remaining Values (MRV) Heuristics:
- The variables chosen first are the ones with fewer legal values
- a.k.a. “fail-first” heuristic → if we find a variable fails, we can avoid checking other permutations on the other variables.
- This doesn't help if all the variables have the same number of legal values (e.g. at the start of the N-Queens problem).
- Degree Heuristics:
- The variables chosen first are the ones involved in the largest number of constraints.
We could combine both of the heuristics: default to MRV and switch to Degree Heuristics in case of legal values amount equality.
Domain values Order
We then need a heuristic to iterate the values in the domain of the selected variable in a more efficient manner.
- Least Constraining Value:
- we apply the same (but opposite) logic of MRV: we choose the value that rules out the fewest choices for the neighbouring variables in the constraint graph.
Forward Checking
Inference: imposing a level of consistency to a variable.
Whenever a variable is assigned, we can perform a simple form of inference called Forward Checking: enforcing arc-consistency for that variable. This is only useful if we don't already do it in the preprocessing step.
todo: don't I already do this? What is the preprocessing step? Maybe here we can apply a higher (and therefore more expensive) level of consistency.
Local Search
Instead of iterating all the values in all the variables, we select a random variable that violates some constraint.
This algorithm can use a heuristic called min-conflicts: once a conflicted variable has been picked, it is reassigned the value that violates the fewest constraints (a sketch follows below).
This works very well even for big problems, because the runtime is roughly independent of the problem size.
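A small sketch of local search with the min-conflicts heuristic on N-Queens (one queen per column, the value is the row); the step limit is an arbitrary choice:

```python
import random

def conflicts(state, col, row):
    """Number of queens attacking a queen placed at (col, row)."""
    return sum(1 for c, r in enumerate(state)
               if c != col and (r == row or abs(r - row) == abs(c - col)))

def min_conflicts(n=8, max_steps=10000):
    state = [random.randrange(n) for _ in range(n)]          # random complete assignment
    for _ in range(max_steps):
        conflicted = [c for c in range(n) if conflicts(state, c, state[c]) > 0]
        if not conflicted:
            return state                                     # solution found
        col = random.choice(conflicted)                      # pick a conflicted variable
        # choose the value (row) that minimises the number of conflicts
        state[col] = min(range(n), key=lambda row: conflicts(state, col, row))
    return None

print(min_conflicts())
```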
Knowledge, reasoning, and planning
7. Logical Agents
Instead of working with states and having code
to define the actions, now we use logic.
We use a Knowledge Base (KB), which is a set of sentences.
The knowledge base contains sentences that can either be axioms (taken for granted) or derived (from other sentences).
It's possible to interact with the knowledge base with two operations:
- TELL: a way to add a sentence
- ASK: a way to query what is known.
Derivation is a form of inference, and it is hidden behind the TELL and ASK operations.
An agent maintains a knowledge base, which at start may contain some background knowledge.
Each time an agent is called, it performs the following operations:
- MAKE-PERCEPT-SENTENCE: it uses TELL and constructs a sentence asserting that the agent perceived the given percept at the given time
- MAKE-ACTION-QUERY: it ASKs what action should be done at the current time
- MAKE-ACTION-SENTENCE: constructs a sentence asserting that the chosen action was executed
function KB-AGENT( percept ) returns an action
persistent:
KB # a knowledge base
t # a counter, initially 0, indicating time
TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
action ← ASK(KB, MAKE-ACTION-QUERY(t))
TELL(KB, MAKE-ACTION-SENTENCE(action, t))
t ← t + 1
return action
This approach is declarative (not procedural as before): we work at the knowledge level, just specifying the desired goal. When querying, we don't need to know how the implementation level works, since it is independent.
details about logic and propositional logic already covered in Strutture Discrete and Fondamenti di Informatica subjects.
How can we prove that a sentence is entailed by the KB?
We can either:
- Iterate the truth table
- Generate the derivation sequence
Language | Ontological Commitment (what exists in the world) | Epistemological Commitment (what an agent believes about facts) |
---|---|---|
Propositional logic | facts | true/false/unknown |
First-order logic | facts, objects, relations | true/false/unknown |
Temporal logic | facts, objects, relations, times | true/false/unknown |
Probability theory | facts | degree of belief ∈ [0, 1] |
Fuzzy logic | facts with degree of truth ∈ [0, 1] | known interval value |
8. First-Order Logic
This kind of logic allows us to represent the world in an easier way. Propositional Logic isn’t suitable to represent knowledge of complex environments. First-Order Logic is sufficiently expressive to represent a good deal of our commonsense knowledge.
Base Elements
- Constant symbols: objects (e.g. Richard or John)
- Predicate symbols: relations (e.g. Person, King or Crown)
- Function symbols: functions (e.g. Brother, AgeOf, SquareRoot)
- Variables: x, y, a, b, ...
- Connectors: ¬, ∧, ∨, ⇒, ⇔
- Quantifiers: ∀, ∃
- Equality: =
Difference between predicate and function symbols:
- functions return a value, predicates are relations among values in a tuple of objects.
- Red(x) is a predicate since it just asserts that x is red. FatherOf(x) is a function since it returns the father of x. Father(x, y) could be a predicate if it meant "x is the father of y".
General Rules for Exercises
Universal Elimination
We remember the Modus Ponens inference rule: from α and α ⇒ β we can infer β.
Universal Elimination is useful when we're dealing with quantifiers: from ∀v α we can infer Subst({v/g}, α), where g is any ground term.
This uses the concept of substitution: we call Subst(θ, α) the result of applying the substitution θ to the sentence α.
We need a process that finds substitutions that make different logical expressions look identical.
We call this process unification - it's a function that returns a unifier, if one exists: Unify(p, q) = θ such that Subst(θ, p) = Subst(θ, q).
Generalized Modus Ponens
Basically, it's Modus Ponens combined with unification.
todo
9. Inference in First-Order Logic
We have our KB. In practice, how do we apply inference?
Forward Chaining Algorithm
It’s called “Forward” since it gets executed whenever a new fact is added to the KB.
It iterates all the rules in the Knowledge Base and for each of them it checks if there are new inferences that can be done. If so, the algorithm continues, until there are no more.
This is like brute force.
Let’s analyze the algorithm:
- Every inference is an application of Generalized Modus Ponens
- It is complete if the knowledge base has definite clauses.
- The naive implementation is highly inefficient but it can be improved considerably.
Backward Chaining Algorithm
It’s called “Backward” since it gets executed whenever a new query is performed on the KB.
It starts from the goal and reaches the axioms backwards.
We perform basically a DFS.
- We look for rules whose conclusion unifies with the current goal
- We generate the unifiers and proceed recursively on the premises of those rules
Resolution
Conversion to CNF
Theorem
Every sentence of first-order logic can be converted into an inferentially equivalent CNF sentence
notice the “inferentially”
These are the steps:
- Eliminate the implications
- Remember that α ⇒ β is the same as ¬α ∨ β
- Move ¬ inwards
- For example ¬∀x p becomes ∃x ¬p (and ¬∃x p becomes ∀x ¬p)
- We can use De Morgan rules as well
- For example ¬(α ∧ β) becomes ¬α ∨ ¬β
- Standardize variables
- If we have multiple quantifiers, we make sure that each variable has a unique name (useful for the following steps)
- Skolemize
- With this step we keep only universal quantifiers (∀), deleting the existential ones (∃).
- We first move the quantifiers all to the left, respecting their original order.
- Then, we have to remove the ∃ symbols.
- Let's take this example: ∀x ∃y P(x, y).
- This is valid for a generic value of y that depends on x
- So we can write it as ∀x P(x, F(x)), where F is a new function symbol
- In general: what to substitute depends on the number of universal quantifiers the ∃ symbol has on its left. If it has none, we can replace the variable with an arbitrary new constant (called a Skolem constant). Otherwise we replace it with a function (a Skolem function) that has as many parameters as there are universal quantifiers on its left.
- Drop universal quantifiers
- Distribute ∨ over ∧
- We use the distribution property α ∨ (β ∧ γ) ≡ (α ∨ β) ∧ (α ∨ γ)
A short worked example following these steps is sketched below.
Proof
Now, we have to prove that a sentence α follows from the knowledge base (KB ⊨ α).
Resolution proves KB ⊨ α by showing that KB ∧ ¬α is unsatisfiable (proof by contradiction).
We assume the knowledge is already in CNF. If not, we use the previous algorithm.
- We negate α.
- We transform ¬α to CNF
- We add it to the knowledge base
- We apply inference and look for a contradiction.
Uncertain knowledge and reasoning
13. Quantifying Uncertainty
probability already covered in Strutture Discrete subject.
Syntax
We can have random variables. We could write something like:
P (Weather = sunny) = 0.6
P (Weather = rain) = 0.1
P (Weather = cloudy) = 0.29
P (Weather = snow) = 0.01
or as an abbreviation: P(Weather) = <0.6, 0.1, 0.29, 0.01>
(assuming a predefined ordering: <sunny, rain, cloudy, snow>
)
Values must be normalised (the sum must be equal to one).
It’s also possible to define the probability of multiple variables together.
It’s a table called Joint Probability Distribution.
In the following example W is Weather and C is Cavity; each entry is obtained with the product rule (a small sketch of this construction follows the equations).
P (W = sunny ∧ C = true) = P (W = sunny|C = true) P (C = true)
P (W = rain ∧ C = true) = P (W = rain|C = true) P (C = true)
P (W = cloudy ∧ C = true) = P (W = cloudy|C = true) P (C = true)
P (W = snow ∧ C = true) = P (W = snow |C = true) P (C = true)
P (W = sunny ∧ C = false) = P (W = sunny|C = false) P (C = false)
P (W = rain ∧ C = false) = P (W = rain|C = false) P (C = false)
P (W = cloudy ∧ C = false) = P (W = cloudy|C = false) P (C = false)
P (W = snow ∧ C = false) = P (W = snow |C = false) P (C = false)
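A small sketch showing how such a joint table can be built with the product rule P(w ∧ c) = P(w | c) P(c); all the numbers are illustrative:

```python
# Build a joint distribution table from a conditional and a prior (toy numbers).

p_cavity = {True: 0.2, False: 0.8}
p_weather_given_cavity = {                     # P(Weather | Cavity)
    True:  {"sunny": 0.6, "rain": 0.1, "cloudy": 0.29, "snow": 0.01},
    False: {"sunny": 0.6, "rain": 0.1, "cloudy": 0.29, "snow": 0.01},
}

joint = {(w, c): p_weather_given_cavity[c][w] * p_cavity[c]
         for c in p_cavity for w in p_weather_given_cavity[c]}

print(joint[("sunny", True)])                  # P(W = sunny ∧ C = true) = 0.12
print(sum(joint.values()))                     # ≈ 1.0: the whole table is normalised
```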
Independence
Some variables are independent of others. For example
P (cloudy | toothache, catch, cavity)
can be written as P (cloudy)
The property used is called independence.
Independence between propositions can be written as follows:
P(a | b) = P (a)
, P(b | a) = P(b)
or P(a ∧ b) = P(a)P(b)
.
This is useful because it can drastically reduce the number of entries needed to specify the full joint distribution.
Conditional Independence
16. Making Simple Decisions
A knowledge base is non-monotonic if logical deductions can change due to new facts.
Utility theory: used to represent preferences
Decision theory: utility theory + probability theory
Game theory: decision theory against players whose decisions aim to minimize our utility
Learning
18. Learning From Examples
training set: a set of examples in the form of (input, output) pairs
Curve fitting: take the simplest consistent function (Ockham’s razor)
Examples are:
- described by attributes (boolean)
- classified by an outcome
Decision Tree Learning
Each node is an attribute and each branch is a possible value.
In case of booleans, the tree is a binary tree.
Goal: find the smallest consistent tree.
The idea is to choose recursively the most significant attribute (the one that best splits the examples by outcome, in the boolean case).
Let's formally define what "splits better" means.
Entropy of a variable V: H(V) = -Σ_k P(v_k) log2 P(v_k)
If a variable is boolean, the entropy becomes: B(q) = -(q log2 q + (1 - q) log2 (1 - q))
To calculate the entropy of a training set T with p positive and n negative examples: B(p / (p + n))
If we want to know how much a single attribute A contributes, we compute the entropy expected to remain after testing it: Remainder(A) = Σ_k ((p_k + n_k) / (p + n)) · B(p_k / (p_k + n_k))
Information Gain: Gain(A) = B(p / (p + n)) - Remainder(A)
We can choose the attribute with the highest gain (a small sketch follows below).
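A small sketch of entropy and information gain on a toy boolean training set; the examples are made up for illustration:

```python
import math

def B(q):
    """Entropy of a boolean variable that is true with probability q."""
    if q in (0, 1):
        return 0.0
    return -(q * math.log2(q) + (1 - q) * math.log2(1 - q))

def gain(examples, attribute):
    """Information gain = entropy of the outcome minus the expected remainder."""
    p = sum(1 for e in examples if e["outcome"])
    total_entropy = B(p / len(examples))
    remainder = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e for e in examples if e[attribute] == value]
        pk = sum(1 for e in subset if e["outcome"])
        remainder += len(subset) / len(examples) * B(pk / len(subset))
    return total_entropy - remainder

examples = [
    {"raining": True,  "outcome": False},
    {"raining": True,  "outcome": False},
    {"raining": False, "outcome": True},
    {"raining": False, "outcome": True},
]
print(gain(examples, "raining"))   # 1.0: the attribute splits the examples perfectly
```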
Performance Measure
How do we know that the learned hypothesis is close to the target function? We try it on a test set of new examples.
A bigger training set implies a better prediction.
The learning curve is the percentage of correctness over the training set size.
A learned function is probably approximately correct if, with high probability, its error on new examples is small.
This does not hold for every algorithm. If it does, the algorithm is called a PAC Learning algorithm.
A PAC Learning Algorithm takes as input a set of examples and two parameters, ε and δ; in other words:
- the returned function is probably approximately correct with accuracy ε: its error is at most ε.
- the returned function is probably approximately correct with confidence δ: the accuracy bound holds with probability at least 1 - δ.