## Just Nest It

The usefulness of nested functions in MATLAB has been the subject of heated debate (check out the comments in Loren Shure’s post Nested Functions and Variable Scope, for instance), mainly because of the complexity of the data flow they introduce.

This is not an attempt to disprove those complexity arguments. The relative complexity is real, but one might argue that it is an inevitable consequence of data sharing. An editor affordance (Tim Davis’s “definitions”: (1) the ability of one to pay the costs for attending a prom, (2) what you do if you turn the wheels too sharply in your Mustang while driving on ice; Joel Spolsky’s definition in the context of a GUI) could probably make the situation more tenable and the coding process less error-prone, but that is another story. This is, however, an attempt to showcase the beauty and versatility that nested functions can bring to a piece of code, provided the author is mindful of when and how to take advantage of their data sharing features without compromising maintainability.

Before embarking on this somewhat tortuous journey, a refresher is in order. What follows is a list of points to bear in mind when dealing with nested functions, excerpted (with modifications) from either the MATLAB documentation on Nested Functions or the discussions in the comments of the related posts by Loren Shure.

• A nested function can be called from
– the level immediately above it.
– a function nested at the same level within the same parent function.
– a function at any lower level.
• Nested functions are not accessible to the `str2func` or `feval` functions. You cannot call a nested function using a handle that has been constructed with `str2func`, and you cannot call a nested function by evaluating the function name with `feval`. To call a nested function, you must either call it directly by name or construct a function handle for it using the `@` operator.
• As a rule, a variable used or defined within a nested function resides in the workspace of the outermost function that both contains the nested function and accesses that variable. The scope of this variable is then the function to which this workspace belongs, and all functions nested to any level within that function.
The special case of `varargin` and `varargout` variables is particularly interesting: if a nested function includes `varargin` or `varargout` in its function declaration line, then the use of `varargin` or `varargout` within that function returns optional arguments passed to or from that function. If `varargin` or `varargout` are not in the nested function declaration but are in the declaration of an outer function, then the use of `varargin` or `varargout` within the nested function returns optional arguments passed to the outer function.
• Variables containing values returned by a nested function are not in the scope of outer functions.
• Externally scoped variables that are used in nested functions for which a function handle exists are stored within the function handle. So, function handles not only contain information about accessing a function, for nested functions, they also store the values of any externally scoped variables required to execute the function.
It is interesting to note that `whos fh_nested`, where `fh_nested` is a function handle to a nested function, shows only the size of the function handle itself, not everything that it encapsulates. This can be verified by examining the structure returned by `struct_fh_nested = functions(fh_nested)`.
• Variables cannot be “poofed” into the workspace of nested functions. [As Loren Shure so eloquently put it.] The scoping rules for nested, and in some cases anonymous, functions require that all variables used within the function be present in the text of the M-file code. MATLAB issues an error if you attempt to dynamically add a variable to the workspace of an anonymous function, a nested function, or a function that contains a nested function. An important operation that dynamically adds variables to the workspace is loading variables from a MAT-file using `load` without an output argument; this can be avoided by using the form of `load` that returns a MATLAB structure.
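
The last point can be illustrated with a minimal sketch. The function name `loadIntoStruct`, the field names `a` and `b`, and the temporary file are all hypothetical and chosen only for illustration. The file contents are loaded into a structure, so no variable has to be added dynamically to the workspace of a function that contains a nested function.

```matlab
function total = loadIntoStruct
% Hypothetical demo: read a MAT-file into a structure so that nothing is
% "poofed" into the workspace of a function containing a nested function.

% Create a small MAT-file holding the variables a and b.
src.a   = 1;
src.b   = 2;
matFile = [tempname '.mat'];
save(matFile,'-struct','src');

% load(matFile) with no output would try to create a and b dynamically,
% which is the situation the list above warns about.
data = load(matFile);        % fields data.a and data.b instead
delete(matFile);

total = addFields();

    function s = addFields
        % data appears in the parent's code, so it is shared here.
        s = data.a + data.b;
    end
end
```

Saved as `loadIntoStruct.m`, calling `loadIntoStruct` should return the sum of the two stored values.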

The examples that follow illustrate what might be dubbed memory effects, or, as T. Driscoll calls them in Learning MATLAB, “lasting side effects.” (Side effects refer to the notion of functions in imperative programming, i.e., to the fact that the same language expression can result in different values depending on the state of the executing function. This implies referential opaqueness, or equivalently lack of referential transparency, which in turn implies impossibility of memoization.)
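
Before the larger examples, here is a minimal sketch of such a memory effect. The names `makeCounter`, `count`, and `increment` are hypothetical and not taken from any of the sources cited above. The variable `count` lives in the workspace of the parent function and is stored in the returned function handle, so the same expression yields a different value on each call.

```matlab
function nextCount = makeCounter(start)
% Hypothetical helper: returns a handle to a nested function that
% remembers how many times it has been called.
count     = start;           % state shared with the nested function
nextCount = @increment;

    function c = increment
        count = count + 1;   % updates the copy stored in the handle
        c     = count;
    end
end
```

With `a = makeCounter(0)`, the calls `a()` and `a()` return 1 and then 2. A second handle `b = makeCounter(10)` carries its own independent state and returns 11 on its first call.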

The first example features two functions that perform the same task of returning segments of an Excel sheet, advancing daily. An initial array is read during the first run of either function and is used as a cache for past data up to the current day. In a subsequent run, another array is read and used as a cache for future data, from which every time the current day data row is picked and appended to the past data cache, dropping the most obsolete (i.e., the first) row of the past data cache every time.
One of these functions is implemented as an ordinary M-file function (which could equally be made a subfunction of the invoking function) and the other as a nested function. The differences between the two implementations are highlighted.

Implementation based on Ordinary functions/Subfunctions

```
function [curDate,E] = readInputFileDaily_sub(inpFileName,inpSheet,...
    colRange,endDate,varargin)

% Reading state that must survive between calls
persistent isFirstRun rowSheet rowE prevEs newEs newDates

% Defaults for sheetStartRow, rowInitEst and nDays2Read; non-empty
% optional arguments override them
optArgs   = {2,252,500};
emptyArgs = cellfun(@isempty,varargin);
[optArgs{~emptyArgs}] = varargin{~emptyArgs};

[sheetStartRow,rowInitEst,nDays2Read] = optArgs{:};

nCols = colRange{2} - colRange{1} + 1;

% First call: read the initial block and use it as the past-data cache
if isempty(isFirstRun)
    isFirstRun = false;
    rowSheet   = rowInitEst;
    inpRange   = [colRange{1} num2str(sheetStartRow) ':' ...
        colRange{2} num2str(rowInitEst)];

    [prevEs,initDate] = xlsread(inpFileName,inpSheet,inpRange);

    notNan    = all(~isnan(prevEs),2);
    prevEs    = prevEs(notNan,:);
    initDate  = initDate(notNan,1);

    curDate   = initDate(end);
    E         = prevEs;
    return
end

% Later calls: advance to the next row without NaNs
while true
    rowSheet = rowSheet + 1;

    % Refill the future-data cache when it is empty or used up
    if isempty(newEs) || (rowE == size(newEs,1))
        rowE     = 1;
        inpRange = [colRange{1} num2str(rowSheet) ':' ...
            colRange{2} num2str(rowSheet + nDays2Read - 1)];

        [newEs,newDates] = xlsread(inpFileName,inpSheet,inpRange);
    else
        rowE = rowE + 1;
    end

    curDate  = newDates(rowE);

    % Signal the end of the data with empty outputs
    if strcmp(curDate,endDate) || isempty(curDate)
        [curDate,E] = deal({},[]);
        return
    end

    % Drop the oldest row and append the current one
    curRowE = newEs(rowE,:);
    if all(~isnan(curRowE)) && (length(curRowE) == nCols - 1)
        E = [prevEs(2:end,:);
            curRowE];
        break
    end
end

prevEs = E;

```

Implementation based on Nested functions

```
function fh_readInputFileCore = readInputFileDaily_nested(...
    inpFileName,inpSheet,colRange,endDate,varargin)

% Defaults for sheetStartRow, rowInitEst and nDays2Read; non-empty
% optional arguments override them
optArgs   = {2,252,500};
emptyArgs = cellfun(@isempty,varargin);
[optArgs{~emptyArgs}] = varargin{~emptyArgs};

[sheetStartRow,rowInitEst,nDays2Read] = optArgs{:};

nCols = colRange{2} - colRange{1} + 1;

fh_readInputFileCore = @readInputFileCore;

% Reading state shared with readInputFileCore and kept in the handle
[rowSheet,rowE]        = deal(0);
[prevEs,newEs]         = deal([]);
newDates               = {};

    function [curDate,E] = readInputFileCore()

        % First call: read the initial block as the past-data cache
        if isempty(prevEs)
            rowSheet = rowInitEst;
            inpRange = [colRange{1} num2str(sheetStartRow) ':' ...
                colRange{2} num2str(rowInitEst)];

            [prevEs,initDate] = xlsread(inpFileName,inpSheet,inpRange);

            notNan    = all(~isnan(prevEs),2);
            prevEs    = prevEs(notNan,:);
            initDate  = initDate(notNan,1);

            curDate = initDate(end);
            E       = prevEs;
            return
        end

        % Later calls: advance to the next row without NaNs
        while true
            rowSheet = rowSheet + 1;

            % Refill the future-data cache when it is empty or used up
            if isempty(newEs) || (rowE == size(newEs,1))
                rowE     = 1;
                inpRange = [colRange{1} num2str(rowSheet) ':' ...
                    colRange{2} num2str(rowSheet + nDays2Read - 1)];

                [newEs,newDates] = xlsread(inpFileName,inpSheet,inpRange);
            else
                rowE = rowE + 1;
            end

            curDate = newDates(rowE);

            % Signal the end of the data with empty outputs
            if strcmp(curDate,endDate) || isempty(curDate)
                [curDate,E] = deal({},[]);
                return
            end

            % Drop the oldest row and append the current one
            curRowE = newEs(rowE,:);
            if all(~isnan(curRowE)) && (length(curRowE) == nCols - 1)
                E = [prevEs(2:end,:);
                    curRowE];
                break
            end
        end

        prevEs = E;
    end
end

```

Here is a snapshot showing the differences of readInputFileDaily_sub and readInputFileDaily_nested side by side.

Differences of readInputFileDaily_sub and readInputFileDaily_nested

Note that, except for `isFirstRun`, which is not used in `readInputFileDaily_nested`, the remaining `persistent` variables of `readInputFileDaily_sub` become externally scoped variables with respect to `readInputFileCore` and are saved in the function handle returned by `readInputFileDaily_nested`.

Invoking readInputFileDaily_sub

```
function maxlikeTransformedData_sub

nTradeDays    = 252;

inpFilePath   = ['.' filesep];
inpFileName   = 'LEHMQ_AIG';
inpFileName   = fullfile(inpFilePath,inpFileName);

colRange      = {'A','D'};
sheetStartRow = 2;
initEstRow    = sheetStartRow + nTradeDays;
endDate       = '9/15/2008';

firms         = {'AIG','LEHMQ'};

inpSheets     = cellfun(@(firmName) ['Input' firmName],firms,...
    'UniformOutput',false);

nFirms    = length(firms);
idxFirms  = 1:nFirms;
for iFirm = idxFirms
    clear readInputFileDaily_sub

    while true
        [curDate,E] = readInputFileDaily_sub(inpFileName,inpSheets{iFirm}, ...
            colRange,endDate,[],initEstRow);
        if isempty(curDate)
            break
        end
        disp(firms{iFirm}), disp(curDate), disp(E)
    end
end

```

Invoking readInputFileDaily_nested

```
function maxlikeTransformedData_nested

nTradeDays    = 252;

inpFilePath   = ['.' filesep];
inpFileName   = 'LEHMQ_AIG';
inpFileName   = fullfile(inpFilePath,inpFileName);

colRange      = {'A','D'};
sheetStartRow = 2;
initEstRow    = sheetStartRow + nTradeDays;
endDate       = '9/15/2008';

firms         = {'AIG','LEHMQ'};

inpSheets     = cellfun(@(firmName) ['Input' firmName],firms,...
    'UniformOutput',false);

nFirms    = length(firms);
idxFirms  = 1:nFirms;
for iFirm = idxFirms
    readNextSegment.(firms{iFirm}) = ...
        readInputFileDaily_nested(inpFileName,inpSheets{iFirm},...
        colRange,endDate,[],initEstRow);
end

while true
    for iFirm = idxFirms
        [curDate,E] = readNextSegment.(firms{iFirm})();
        if isempty(curDate)
            idxFirms(idxFirms == iFirm) = [];
            continue
        end
        disp(firms{iFirm}), disp(curDate), disp(E)
    end
    if isempty(idxFirms), break, end
end

```

Here is a snapshot showing the differences of maxlikeTransformedData_sub and maxlikeTransformedData_nested side by side.

Differences of maxlikeTransformedData_sub and maxlikeTransformedData_nested

The beauty of the code that uses nested functions should be apparent by now. For each firm in the cell array `firms` (each firm corresponds to a sheet in the Excel file), the function `readInputFileDaily_nested` is invoked once to create a unique function handle to `readInputFileCore`, which is stored, using dynamic field names, in the field `readNextSegment.(firms{iFirm})` of the structure `readNextSegment`. (This use of nested functions together with function handles can be thought of as creating lightweight objects in an OOP context.) The actual reading of data for each firm is done by invoking `readInputFileCore` through the function handle `readNextSegment.(firms{iFirm})`. Because the reading state (i.e., the externally scoped variables `rowSheet`, `rowE`, `prevEs`, `newEs`, and `newDates`) for each firm (i.e., sheet) is stored in its corresponding function handle, it is possible to read a segment of all sheets simultaneously, perform the computations that depend on all these data segments being present (such as calculating the joint probability of an event for the firms, after estimating the parameters required for each individual firm), and proceed to another segment.
Clearly, the implementation that uses ordinary functions (or subfunctions, for that matter) does not allow this simultaneous processing: at any point in time there can be only one set of `persistent` variables for the function `readInputFileDaily_sub`, whereas processing different sheets simultaneously requires independent sets of these variables at the same time. This is also the reason `readInputFileDaily_sub` has to be cleared from memory (the `clear readInputFileDaily_sub` call in `maxlikeTransformedData_sub`) before each sheet is processed; clearing makes sure we are not using values left in the `persistent` variables by previous invocations of this function.
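
The pattern can be reduced to a small sketch that needs no Excel file. Everything in it is hypothetical: the names `interleavedReaders`, `makeReader`, and `readNext`, as well as the in-memory vectors that stand in for the sheets. Each handle keeps its own read position, so the two data sets can be stepped through in lockstep, in the same way `maxlikeTransformedData_nested` steps through the sheets.

```matlab
function visited = interleavedReaders
% Hypothetical sketch: one stateful reader per data set, read in lockstep.
data.AIG   = [1 2 3];
data.LEHMQ = [10 20];
names      = fieldnames(data)';      % row cell array, one name per firm

% One function handle per firm, stored under a dynamic field name
for name = names
    readNext.(name{1}) = makeReader(data.(name{1}));
end

visited = {};
active  = names;
while ~isempty(active)
    for name = active
        value = readNext.(name{1})();
        if isempty(value)            % this reader is exhausted
            active(strcmp(active,name{1})) = [];
            continue
        end
        visited{end+1} = sprintf('%s:%g',name{1},value); %#ok<AGROW>
    end
end
end

function fh = makeReader(values)
% Returns a handle whose state (pos) is private to this reader.
pos = 0;
fh  = @next;

    function v = next
        if pos >= numel(values)
            v = [];                  % nothing left to read
        else
            pos = pos + 1;
            v   = values(pos);
        end
    end
end
```

The returned cell array alternates between the two firms until the shorter data set runs out, which a single set of `persistent` variables could not support.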

As another beautiful use of the memory effects provided by nested functions, let’s look at an example excerpted, with modifications, from “Learning MATLAB.”
Suppose our “objective” is to find the smallest $x$ such that the maximum real part of the eigenvalues of a certain matrix $A(x)$ equals 1. This is a straightforward task using `fzero`, but now suppose that we seek not only the value of $x$ but also the eigenvalues of $A(x)$ at the “optimal” $x$, without repeating the eigenvalue computation. This is a perfectly reasonable “constraint”: we have already done that computation and should not need to redo it.
Coding the objective as a nested function makes short work of fulfilling this task.

```
function x0 = findx
B  = diag(ones(49,1),1);
A  = B - B';
x0 = fzero(@objective,[0 10]);
plot(e,'*')     % e appears in the parent, so it is shared with objective

    function r = objective(x)
        A(1,1) = x;
        e = eig(A); r = max(real(e)) - 1;   % e keeps the last eigenvalues
    end
end

```

Again, note that the appearance of `e` in the body of the parent function `findx` (in the line `plot(e,'*')`) is critical for its value to be shared between the parent and the nested function.
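
The same sharing mechanism can capture other by-products of a solver run. The following is a minimal sketch, assuming a hypothetical function `countEvals` and a toy objective $x^2 - 2$, neither of which comes from “Learning MATLAB”. It counts how many times `fzero` evaluates the objective.

```matlab
function [x0,nEvals] = countEvals
% Hypothetical demo: a nested objective records a by-product (the number
% of evaluations) in a variable shared with the parent function.
nEvals = 0;                          % defined in the parent, hence shared
x0     = fzero(@objective,[0 2]);

    function r = objective(x)
        nEvals = nEvals + 1;         % lasting side effect of each call
        r      = x^2 - 2;            % toy objective with a root at sqrt(2)
    end
end
```

Calling `[x0,nEvals] = countEvals` returns an approximation of $\sqrt{2}$ together with the evaluation count, without any global or `persistent` variable.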

See also

John D’Errico’s `loopchoose`, a “looped version of `nchoosek`. `nchoosek` can generate all combinations of a set of numbers. But sometimes that set can grow too large to store.” The solution is to generate each member of the set of all combinations in turn; `loopchoose` does exactly that.
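
To connect this idea to the nested functions discussed above, here is a rough sketch of generating combinations one at a time. It is not John D’Errico’s implementation: the name `comboGenerator`, its interface, and the lexicographic stepping rule are assumptions made for illustration, and it assumes `1 <= k <= n`.

```matlab
function nextCombo = comboGenerator(n,k)
% Hypothetical sketch: returns a handle that yields the k-element
% combinations of 1:n one at a time, in lexicographic order, and []
% once they are exhausted. Assumes 1 <= k <= n.
combo     = [];                      % last combination returned (state)
nextCombo = @next;

    function c = next
        if isempty(combo)
            combo = 1:k;             % first combination
        else
            % Rightmost position that can still be incremented
            pos = find(combo < (n-k+1:n),1,'last');
            if isempty(pos)
                c = [];              % no combinations left
                return
            end
            combo(pos:k) = combo(pos) + (1:k-pos+1);
        end
        c = combo;
    end
end
```

With `next = comboGenerator(4,2)`, repeated calls to `next()` walk through the same rows that `nchoosek(1:4,2)` would return at once, while only one combination is held in memory at a time.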

Attachments

LEHMQ_AIG.xls
(The extension of this file has been changed to doc to allow uploading to WordPress.com. To be used without changes to the code in this post, this file should be renamed to LEHMQ_AIG.xls after downloading.)
