Monday, April 30, 2012
File poller issues
You set up a process using File Poller as the starter activity, and in the process you try to read the file. Imagine the files are large, in the order of tens of MB, e.g. 20 or 30 MB. The immediate issue you would face is this: if the file handle has not been released by the source application that is generating these files, and File Poller tries to read them, what result would you get? An IOException? The deeper question is, when does the create event trigger? After the file handle is released?
How do you avoid reading incomplete files? How do you improve the design of this process?
One way could be to try moving the files to a temp folder. If that succeeds, then you know the file is written completely, and you can read the file content into memory and then process it.
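The move-then-read idea can be sketched outside TIBCO in plain Python. This is only an illustration, not a File Poller feature: `claim_file`, `read_claimed` and `staging_dir` are hypothetical names. It also assumes a platform where a file that is still open for writing cannot be moved (typical on Windows). On POSIX systems a rename can succeed while the writer still holds the file, so there the successful move alone does not prove the file is complete.

```python
import shutil
from pathlib import Path
from typing import Optional


def claim_file(source: Path, staging_dir: Path) -> Optional[Path]:
    """Try to move `source` into `staging_dir`; return the new path, or None if the move fails."""
    staging_dir.mkdir(parents=True, exist_ok=True)
    target = staging_dir / source.name
    try:
        shutil.move(str(source), str(target))
    except OSError:
        # The writer may still hold the file, or the file is gone; try again on a later poll.
        return None
    return target


def read_claimed(source: Path, staging_dir: Path) -> Optional[bytes]:
    """Read the file content into memory only after the move has succeeded."""
    claimed = claim_file(source, staging_dir)
    if claimed is None:
        return None
    return claimed.read_bytes()
```

A caller would invoke `read_claimed` on each poll and simply skip the file when it returns `None`.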
But imagine the business requirement does not allow you to move the file. Then what would you do?
If, for some reason, you know that the files are written at 1 am, then instead of using File Poller, using a timer scheduled at 2 am and reading the files then would solve the issue. But what if you do not know, and you have to process in near real time?
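One possible answer for the near-real-time case, when the file may not be moved, is to wait until the file stops changing before reading it. The sketch below is a heuristic in plain Python, not something File Poller does for you: `wait_until_stable`, `quiet_seconds`, `timeout` and `interval` are hypothetical names, and a writer that pauses for longer than the quiet period would still fool it.

```python
import time
from pathlib import Path
from typing import Tuple


def snapshot(path: Path) -> Tuple[int, int]:
    """Return the (size, last-modified) pair used to detect changes."""
    st = path.stat()
    return (st.st_size, st.st_mtime_ns)


def wait_until_stable(path: Path, quiet_seconds: float = 5.0,
                      timeout: float = 300.0, interval: float = 1.0) -> bool:
    """Return True once size and mtime have not changed for `quiet_seconds`,
    or False if that does not happen within `timeout` seconds."""
    deadline = time.monotonic() + timeout
    last = snapshot(path)
    stable_since = time.monotonic()
    while time.monotonic() < deadline:
        time.sleep(interval)
        current = snapshot(path)
        now = time.monotonic()
        if current != last:
            # The file is still being written; restart the quiet period.
            last, stable_since = current, now
        elif now - stable_since >= quiet_seconds:
            return True
    return False
```

The quiet period trades latency for safety: a longer value delays processing but lowers the chance of reading a half-written file.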
On top of this, imagine you are required to poll only the files that were modified today. How would you achieve that? Simple, right? The output of File Poller gives the last modified attribute, which you can use to filter the files.
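The same filter can be sketched in plain Python, using the file system's modification time as a stand-in for the last modified attribute in the poller output. The names `modified_on` and `files_modified_today` are hypothetical, and "today" is taken in the machine's local time zone, which is an assumption.

```python
from datetime import date, datetime
from pathlib import Path
from typing import List, Optional


def modified_on(path: Path, day: date) -> bool:
    """True if the file's last-modified time falls on `day` (local time)."""
    return datetime.fromtimestamp(path.stat().st_mtime).date() == day


def files_modified_today(folder: Path, today: Optional[date] = None) -> List[Path]:
    """List regular files in `folder` whose last-modified date is today."""
    day = today or date.today()
    return sorted(p for p in folder.iterdir() if p.is_file() and modified_on(p, day))
```

Passing `today` explicitly makes the filter easy to test and lets a late-running job process the previous day's files on purpose.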
Imagine this process is restarted because of a machine reboot, a system issue, etc. Would the poller poll all the files again? How would you avoid duplicates?
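One way to avoid duplicates after a restart is to keep a small persistent record of what has already been processed and to consult it before handling a file. The sketch below is plain Python and only illustrates the idea. `ProcessedLedger` and the JSON ledger file are hypothetical, and it assumes that a file name plus its last-modified time is enough to identify a file version.

```python
import json
from pathlib import Path
from typing import Dict


class ProcessedLedger:
    """Remembers which (file, mtime) pairs were handled, surviving restarts via a JSON file."""

    def __init__(self, ledger_path: Path) -> None:
        self.ledger_path = ledger_path
        self.seen: Dict[str, int] = {}
        if ledger_path.exists():
            # Reload the state left behind by the previous run.
            self.seen = json.loads(ledger_path.read_text())

    def is_new(self, path: Path) -> bool:
        """True if this file version has not been marked as processed."""
        return self.seen.get(str(path)) != path.stat().st_mtime_ns

    def mark_done(self, path: Path) -> None:
        """Record the file version and persist the ledger to disk."""
        self.seen[str(path)] = path.stat().st_mtime_ns
        tmp = self.ledger_path.with_suffix(".tmp")
        tmp.write_text(json.dumps(self.seen))
        # Swap the new ledger in with a single rename so a crash is
        # unlikely to leave a half-written ledger behind.
        tmp.replace(self.ledger_path)
```

Calling `mark_done` only after processing succeeds means a crash in the middle leads to a retry rather than a lost file, at the cost of a possible repeat for that one file.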
A thousand questions on a single activity. Using TIBCO simplifies a lot of things, but there is still a lot you have to know and care about.
Another TIBCO low-level blog from Rajendrq Parida.
Cheers