Brothers In Code

...a serious misallocation of .net resources

The "Could Not Load File Or Assembly" Error Strikes in a New Way

I got bit by this old error in a new way the other day so I thought I should make a quick checklist for myself since I made some mistakes in my troubleshooting process. 

First I would start by reading How the Runtime Locates Assemblies.  However, unless you are depending one of the methods that change the normal process, like including additional "codebases" or publisher policy redirection, then we can widdle this down to a few simple causes:

  • A Referenced Assembly is not in the Application's Path
  • A Referenced Assembly is not in the Global Assembly Cache
  • The Referenced Assembly is a different version than the installed assembly.
  • The Application is 32 bit and the dll is 64 bit

The Old Problems - Local Files and the GAC

Many if not most applications rely on a simple xcopy deployment, meaning that all of the application's files will be bundled together in the same directory.  If you're simply copying or app folder from one place to another and it's not working on the new machine than either you simply missed a dll or you were using a GAC'd assembly on your development machine and didn't realize it.  If you're using a third party component that was installed with a setup program, than there's a good chance that your referenced assembly is in the GAC.  In that case you need to decide if you you want to run the third party install program as part of your deployment or figure out which dlls that you need and copy them to the folder.

GAC or otherwise, your app isn't going to find anything if it references a different version than what you have installed.  There are two options if realigning the deployed version isn't an option.  Either change "Specific Version" to false in the reference properties in Visual Studio or do a policy redirection.  I actually recommend the latter since big companies will include policies for the GAC that point old references to newer versions.  On the flip side, changing specific version to false will let your app load any version including an ancient one which might give you all sorts of strange errors.

 The New Problems - 32 Bit, 64 Bit and the Platform Target Configuration

It's the last cause that is the inspiration for this post since it has now bit me multiple times.  In my case i was referencing Oracle.DataAccess.dll from Oracle's data provider for .net.  ODP.net throws in some additional variables like PATH and "ORACLE_HOME" environment variable dependencies that threw me off correctly debugging the problem.  After double checking that the dll was in the GAC, and then even copying the dll to the app folder in desperation, I still was getting the error in question.  The error also included the phrase "or one of its dependencies" which kept doubling me back to a PATH problem.  In retrospect I really don't ever remember a case where this error was caused by a dependent dll so I'm not sure why I put so much stock in that route.

Finally I got my wits back and loaded a consistent savior, Sysinternals Process Monitor.  I don't know why I'm so slow to use this tool some times.  Maybe it's the 5 minutes you have to spend getting the filters right, but 9 times out of 10 it more or less tells me exactly what is wrong.  Sure enough I saw the program trying to find my dll in GAC_32.  Why is it looking for a 32 bit dll when we are on a 64 bit machine with the 64 bit oracle provider....you dumb-ass I thought to myself, knowing that I had seen this before.  Sure enough the task manager showed the infamous *32 next to my process.  I went back to visual studio and looked that the build properties for my exe project - once again Platform Target was set for "x86".  I set it to "Any CPU", recompiled and the stupid world was right again.

ORA-01036: Illegal Variable Name/Number Using ODP.net

If you've reading this you've probably been to several sites already and are just about ready to kill something. Why Oracle couldn't print the name of the parameter that you attepted to bind is beyond me, but after all this is a company that was very recently recommending "locator varibles." I'll be honest and say there is no magic bullet here. Compared to Sql Server's Profiler, Oracle's statement tracing functionality is not developer friendly and the ODP client side tracing is pretty worthless. I can really only give some tips for things to look for and some brute force debugging tactics.

To be clear, this is with Oracle's data provider (Oracle.DataAccess.Client) and not Microsoft's Oracle provider (System.Data.OracleClient). The Microsoft provider is being deprecated so it is not recommended that you follow some of the older suggestions of simply switching providers.

In short, this error is a catch all for some sort of mismatch between the parameters referenced in the command text, and the parameters added to the command object. To start there's a couple of simple things to look for:

  • Missing Comma - This one annoys the crap out of me. Forget a comma between constant values in an insert statement, and you'll get a nice "missing comma" error. However, forget a comma between bind variables and you'll get the error in question. My guess is that some genius decided to look only for an operator to terminate a bind variable and didn't consider whitespace.
  • The number of parameters match - This one is an important one and an easy one. We just ran into a problem where a misnamed command variable was causing an extra parameter to be added that we werent seeing. I simple cmd.Parameters.Count in the watch window would have quickly told us there were more params in the collection than were in the statement.
  • The order of the parameters match - Despite the warm fuzzy you might get when your parameter names match what is in your statement, those names do nothing by default. The default functionality is to bind the parameters in the order they are added. You can change this by setting OracleCommand.BindByName = true;
  • Use OracleCommand.BindByName=true if you use the same parameter twice. This is a bit of assumption since I've not tested to confirm it, but I assume that because the default functionality is to bind by order, duplicated parameters would also need to be duplicated on the command object if BindByName is left as false.
  • The type of the parameters match - The type parameter needs to match the type in the database. There is a little bit of flexibility with this with types like numerics (excluding overflow errors, OracleDbType.Int32 can be used in place of OracleDbType.Int64), but if you got lazy and defined parameter as OracleDbType.Varchar2 when you've got a Date column you might get this error. Generally I just use the Parameters.Add overload that takes variable name and value and let oracle decide on the type. This however does not work on output parameters - I set both the type and the length (at least for varchars) for those.

If the above eye-ball debugging doesn't work, I suggest that you start trimming down your statement and params until you get a successful execution and then work backwards from there. I know that's not much to go with but if I find something else I'll be sure to post it.

Oracle ODP.net xcopy deployment for asp.net.

I just got done wrestling with multiple versions of the oracle client.  We wanted to install an Asp.net app that uses the 11g ODP.net install on a server that already had a 10g client installed.  While 11g and 10gR2 are supposed to have decent support for multiple oracle homes, the 10g client that was installed was only R1.  To make matters worse, the app using the 10g client was an old asp app via an ODBC dsn.  So rather than risking messing up the existing asp apps, I decided to give the xcopy deployment a shot. 

Getting the actual software setup is pretty well documented with an article on Oracle's website.  It really came down to just unzipping the package and running:

install.bat odp.net20 c:\oracle\product\1.1.0-xcopy-dep odac11_xcopy

However, my whole point in doing this was to avoid polluting the existing environment so a couple of items were out.   First of all, the last step in the article is to add a couple of environment variables.  Because I didn't want to interfer with the oracle home of an existing client install, i wanted to avoid this.  However without it, I got a nice exception: "Oracle.DataAccess.Client.OracleException The provider is not compatible with the version of Oracle client"

The fix was to add the following to my Application_Start event in the global.asax:

Environment.SetEnvironmentVariable("PATH", @"c:\oracle\product\1.1.0-xcopy-dep;c:\oracle\product\1.1.0-xcopy-dep\bin;", EnvironmentVariableTarget.Process);
Environment.SetEnvironmentVariable("ORACLE_HOME", @"c:\oracle\product\1.1.0-xcopy-dep;", EnvironmentVariableTarget.Process);

Inline TNSNAMES

This got me to an oracle error that I recognized: "Oracle.DataAccess.Client.OracleException ORA-12154: TNS:could not resolve the connect identifier specified."  This is usally fixed by changing tnsnames.ora but in this case it didn't work.  While it might be possible to use tnsnames with this xcopy deployment, I decided on a different route.  I could have used the host/service name format in the connection string, but this oracle environment is setup with RAC (clustering).  The oracle driver has a nice feature where the tns infomation can be inline in the connection string. 
So this:"Data Source=MyTNSName;User Id=user;Password=pwd;"
became this (which is right out of my tnsnames.ora):

 "Data Source=  (DESCRIPTION =
          (ADDRESS = (PROTOCOL = TCP)(HOST = orahost1.company.com)(PORT = 1522))
          (ADDRESS = (PROTOCOL = TCP)(HOST = orahost2.company.com)(PORT = 1522))
          (LOAD_BALANCE = on)
          (CONNECT_DATA =
            (SERVER = DEDICATED)
            (SERVICE_NAME = MyTNSService.company.com)
            (FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC)(RETRIES=180)(DELAY=5))
          )
        )
      ;
      User Id=user;Password=pwd;"

Bypassing Distributed Transactions and the OracleMTSRecoveryService

That nearly got everything working.  The only thing left was a particular block of code that I had wrapped in a System.Transactions.TransactionScope block.  This normally requires the OraceMTSRecoveryService.  This is something that is installable with the xcopy deployment (run install.bat with no params for usage), but this would have overwritten an existing version of the service if it was already there.  Instead I decided to force the transactions to local only (it was only a single call to the db so I knew I wouldn't need a distributed transaction).  This could be done by adding "Promotable Transactions=local" to the connection string but I'm compiling against an older version of the driver (2.111.6.20 x64) in order to support both 64 and 32 bit with the same code and this option wasn't supported in this version.  Instead I changed the following registry key:

[HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\ODP.NET\2.111.7.20]
"PromotableTransaction"="local"

After that everything seems to be working.

P.S. 

Consider using a config change instead of a registry change with:
  <oracle.dataaccess.client>
    <settings>
      <add name="PromotableTransaction" value="local"/>
    </settings>
  </oracle.dataaccess.client>

Reason? I forgot to change the registry fix back and fought with "there was an error promoting the transaction to a distributed transaction oracle" for a day before I figured it out Embarassed.

5/17/2011 - Update:

I've done this again with the 4.0 provider and tnsnames.ora worked fine.  I think the first time I did this I didn't also update sqlnet.ora to use tnsnames with names.directory_path=(TNSNAMES,EZCONNECT). 

Another thing I don't remember doing in the 2.0 version was using the xcopydir\odp.net\bin\4\oraprovcfg utility.  I needed to put Oracle.DataAccess.dll in the GAC, but doing it manually didn't work.  I had to run oraprovcfg /action:gac /providerpath:<fullpath to Oracle.DataAccess.dll>

Last, I've also discovered that ODP.net supports the following configuration that should be a replacement for setting the path and oracle_home environment variables in code:


<configuration>
  <oracle.dataaccess.client>
    <add key="DllPath" value="c:\oracle\product\1.1.0-xcopy-dep\BIN"/>
  </oracle.dataaccess.client>
</configuration>

 See the following for more info:http://docs.oracle.com/html/E10927_01/InstallODP.htm

SQL Plus vs. PL/SQL and binding variables between the two.

I had a lightbulb moment yesterday.  My new employer is mostly an oracle shop so I end up spending a lot of time translating what I know about sql server into oracle terms.  But one thing I've struggled with is SQL Plus vs. PL/SQL.  That is until yesterday.   Maybe I never took the time before but a sequence of searches forced me in the right direction.

I'm using very few stored proceedures lately (reasons for which will come in a different post).  Consequently I was looking for ways to execute a batch of statements .net code with the Oracle Data Provider and found my answer here on OTN.  But then i looked at how he was getting variable data out of the command.  He was basically doing what I've always wanted to do with Oracle's SQL Developer.  When using the Sql Server query tools (query analyzer, SSMS), I'd often write myself maintenance procedures that I just kept in a file.  It's no problem to create variables and run sql statements in the exact same way you would from inside of a stored procedure.  Sometimes I could copy nearly the same code into a SqlCommand and it would run just fine.  Oracle isn't like that.  Variables (pl/sql variables at least) must be declared within the 'declare' section of a begin/end block.  These blocks can be 'anonymous' (not in a stored procedure) but once you enter the block things change.  You can't simply do a 'select column from table' for example or you get an 'INTO clause expected' error.  Oracle assumes that you're now in a programming block and not connected to some user interface so it wants you to store the data in some variable.  But how was this .net code getting this v_maxnum variable out?  On some sort of protocol level there must be a way for a human to do the exact same thing.

Finally thanks to Wikipedia, I got it.  SQL Plus is very similar to the combination of cmd.exe and SqlCmd/isql?  The big difference is that line between sqlcmd and t-sql is blurred a little more blurred than it is with SQL Plus and pl/sql.  But once I dug in to sqlcmd a little I realized that other than begin/end blocks it's really not that different.  I guess I really never used sqlcmd for anything but a single statement from the command line.

The big realization came now when I understood the variable types in SQL Plus.  It has both substitution variables (&var and &&var) and regular variables (var myvar varchar2).  But unlike sqlcmd's variables being t-sql variables, SQL Plus's variables are different than PL/SQL variables.  However, PL/SQL can 'bind' SQL Plus variables.  Take the following example.

var myDate varchar2;
begin
  select sysdate into :myDate from dual;
end;
/
print myDate

Now I finally understood what the data provider was doing and consequently I understood SQL Plus a lot better too.