kerravon86@yahoo.com.au [hercules-os380]
2018-03-12 08:44:37 UTC
I have now consolidated my current hopes for
MVS/380, and Peppe, I need a guinea pig.
PDPCLIB has moved into a position that I am
not happy with, and I'd like to work with you to
bed in a paradigm that I am happy with. And
the best thing for that would be to work with
that assembler program you were writing to
read an IEBCOPY unload file that had been
FTPed so you need to reconstruct the records.
Basically what I would like is for you to pretend
that BTL memory is tight so you are trying to
allocate 130 KB of memory in AT2B (RM32)
space to alleviate pressure on the RM31
space. The 130 KB is for two complete tape
records in case a V record spans two
blocks, and tape blocks are potentially 64 KB
in size.
I know you can use a simpler algorithm, but if
you use this slightly more complex algorithm,
it gives an actual program where we can
demonstrate how to write assembler programs
properly. Also the act of doing this means that
my writing needs to make enough sense that
a Unix programmer can understand it.
I can provide the code that does the AM32
handling if you can test it out. If it's all too
much work, that's fine, just forget it. And
non-Peppe people are free to comment on
the below proposals too.
Thanks. Paul.
Proposed future MVS/380 design and guidelines for
programming in all of MVS/380, MVS 3.8j, MVS/XA,
OS/390 and z/OS - 24-bit, 31-bit, 32-bit and 64-bit.
On the mainframe, just like modern (2018) Windows
and Unix, both 32-bit and 64-bit registers are
available for use, both for data registers and
address registers. The mainframe has traditionally
had an additional complication of address masking
which can restrict the amount of memory to 24 bits
or 31 bits. Unfortunately much code has been
written that makes an assumption that this masking
will occur, and the programs happily run with crud
in the top 8 or top 1 bit of 32-bit address
registers. In addition to that, some 32-bit programs
have assumed address truncation at 32 bits (by
using x'FFFFFFFF' as negative 1 in an address index),
so fail if an attempt is made to run them in AM64.
Those programs require an implied AM32.
But so long as programs are coded in accordance with
this document, there is nothing architecturally
preventing normal 32-bit programming or normal
64-bit programming, or even a combination of the two.
E.g. you could have 32-bit code pointers, 64-bit
data pointers, and 32-bit longs. It is totally up
to your C compiler and the options it provides and
you enable, or however you choose to write your
assembler code.
This document explains how to write your application
code in a flexible manner, but you have no control
over the operating system. Unfortunately much of
both MVS/XA and MVS/380 requires execution in AM24,
and z/OS requires execution in AM31. As such, it is
necessary for your application to be able to switch
to the AMODE that the operating system requires,
noting that different operating systems can handle
different AMODEs. Your application should always be
at least as advanced as the operating system (i.e.
you know in advance about all existing operating
systems before you write your program). As such,
your application is only ever expected to step *down*
to match the operating system, never *up*.
For a step-down to ever be possible, it requires that
the code that is being executed be located lower in
memory than the application data that is being addressed.
It also requires that any parameters that are given
to the operating system be located in lower memory
than the other application data. As a result of this,
there are distinct AMODEs and RMODEs:
RMODE: 24, 31, 32, 64
AMODE: 24, 31, 32, 64
Any combination of these can exist, provided that
the AMODE is always greater than or equal to the
RMODE.
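
As a minimal sketch of the rule just stated, here is a C check
that a given AMODE/RMODE pair is allowed. The function names are
made up for illustration and are not part of any MVS/380 code.

```c
#include <assert.h>
#include <stdbool.h>

/* The four mode widths named in the text. */
bool is_known_mode(int mode)
{
    return mode == 24 || mode == 31 || mode == 32 || mode == 64;
}

/* A combination is usable only when AMODE >= RMODE, i.e. the
   application can address at least everything the code resides in. */
bool mode_combination_ok(int amode, int rmode)
{
    assert(is_known_mode(amode) && is_known_mode(rmode));
    return amode >= rmode;
}
```

So AM32 with RM24 passes, while AM24 with RM31 is rejected.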
So the RMODE is what this particular operating system
can handle, while the AMODE is what your application
has been written to allow. If you have written an
application that doesn't use "VL" or anything else that
may interfere with the top bit of a 32-bit address,
you may use AMODE 32. But all existing operating systems
require an AMODE of 24 or 31, so they will never load
your program into RMODE 32 or 64 space. But because
the operating system may allow 24 or 31, your program
will be marked differently depending on where it is
running. The software installer will typically need to
be informed whether the RMODE needs to be dropped from
RM31 to RM24 when installed on an MVS/380 system,
because MVS/380 generally needs to be executed in
AM24, while z/OS can usually be executed in AM31.
At startup, your application can do a non-destructive
(works on any hardware) test to see what AMODE and
RMODE it has been invoked in, and it can adjust
appropriately.
The specific test for AMODE is to load a register with
x'FFFFFFFF' and then do a LA. If the top 8 bits have
been cleared, it is AM24. If no bits have been cleared,
it is AM32 or 64. If only the top bit is cleared, it
is AM31.
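
The decision logic of that AMODE test can be modelled in C as
below. This is only a sketch: la_result() is a hypothetical
stand-in for what LA would leave in the low 32 bits of the
register under each mode as described above. A real program
would do this in assembler.

```c
#include <assert.h>
#include <stdint.h>

/* Model (assumption, taken from the description in the text) of the
   low 32 bits left by LA on a register that held x'FFFFFFFF'. */
uint32_t la_result(int amode)
{
    assert(amode == 24 || amode == 31 || amode == 32 || amode == 64);
    if (amode == 24) return 0x00FFFFFFu;  /* top 8 bits cleared */
    if (amode == 31) return 0x7FFFFFFFu;  /* only the top bit cleared */
    return 0xFFFFFFFFu;                   /* no bits cleared */
}

/* Classify the AMODE from the value LA produced.
   Returns 24, 31, or 32 (where 32 means "AM32 or AM64"),
   or 0 if the value is none of the three expected patterns. */
int amode_from_la(uint32_t value)
{
    if (value == 0xFFFFFFFFu) return 32;
    if (value == 0x7FFFFFFFu) return 31;
    if (value == 0x00FFFFFFu) return 24;
    return 0;
}
```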
The specific test for RMODE is to load the address of
any routine. If the address is greater than or equal
to x'80000000', it is RM32. Else if the address is
greater than or equal to x'01000000', it is RM31.
Else the load module is RM24.
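
The thresholds above translate directly into a comparison chain.
A small C sketch follows. The function name is hypothetical, and
the address would in practice come from loading the address of
one of your own routines.

```c
#include <stdint.h>

/* Classify the RMODE of a load module from the 32-bit address of
   any routine inside it, using the thresholds given in the text. */
int rmode_from_address(uint32_t addr)
{
    if (addr >= 0x80000000u) return 32;  /* at or above 2 GiB */
    if (addr >= 0x01000000u) return 31;  /* at or above 16 MiB */
    return 24;                           /* below the 16 MiB line */
}
```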
Testing 64-bit is different. At compile time you know
whether your program is using 32-bit or 64-bit data
pointers, and if you are building an AM64 program
you can just hardcode or assume the AMODE is 64. The
RMODE can be determined by doing the same test as
before, except using 64-bit registers and seeing if
any of the high 32 bits are non-zero. If so, it
is RM64, not even RM32.
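
For the 64-bit case the same idea can be sketched as follows,
again with a hypothetical function name. The only addition is
the check of the high 32 bits before falling back to the 32-bit
thresholds.

```c
#include <stdint.h>

/* Classify the RMODE from a full 64-bit routine address. */
int rmode_from_address64(uint64_t addr)
{
    if ((addr >> 32) != 0) return 64;             /* any high bit set */
    if (addr >= UINT64_C(0x80000000)) return 32;  /* 2 GiB to 4 GiB */
    if (addr >= UINT64_C(0x01000000)) return 31;  /* 16 MiB to 2 GiB */
    return 24;                                    /* below 16 MiB */
}
```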
In the situation where the AMODE and RMODE are the
same, it means that the application and operating
system will execute in the same AMODE, so it is
not necessary to (and nor should you) switch AMODEs
prior to calling an operating system routine. By
sticking to this rule, your application will still
work on MVS 3.8j running on standard S/370 hardware,
because no BSM is executed. You are advised to limit
the rest of your application (or compiler) to
instructions that exist on S/370 too, and only use
MVCLE and CLCLE when you know you are dealing with
a buffer of 16 MiB or more in size.
Basically every operating system macro like READ
needs to be bracketed with some sort of macro like
GAMOS (go to the OS's amode) prior to execution,
and GAMAPP (go to the application's amode) after
execution. You also need to ensure that you build
your application on a system like MVS/380 where
the OS macros are 31-bit or 32-bit clean in order
for your application to be marked as something
above RMODE 24.
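
To illustrate the bracketing idea (and the rule that no switch
happens when AMODE and RMODE are the same), here is a small C
model. The struct and the function bodies are assumptions made
for illustration. The real GAMOS and GAMAPP would be assembler
macros, and each counted switch would correspond to a mode
change such as a BSM.

```c
#include <assert.h>

/* Hypothetical model state, not a real control block. */
struct mode_state {
    int app_amode;  /* AMODE the application normally runs in */
    int os_amode;   /* AMODE the OS requires (the module's RMODE) */
    int current;    /* AMODE in effect right now */
    int switches;   /* number of mode switches performed so far */
};

/* GAMOS: step down to the OS's AMODE, but only if it differs. */
void gamos(struct mode_state *s)
{
    if (s->current != s->os_amode) {
        s->current = s->os_amode;
        s->switches++;
    }
}

/* GAMAPP: go back to the application's AMODE, only if it differs. */
void gamapp(struct mode_state *s)
{
    if (s->current != s->app_amode) {
        s->current = s->app_amode;
        s->switches++;
    }
}

/* Bracket one OS service; returns the AMODE the service ran in. */
int call_os_service(struct mode_state *s)
{
    int seen;
    assert(s->app_amode >= s->os_amode);  /* only ever step down */
    gamos(s);
    seen = s->current;  /* stand-in for READ, OPEN, etc. */
    gamapp(s);
    return seen;
}
```

With an AM32 application on an AM24 system, each call costs two
switches. With AM24 on AM24 it costs none, which is what keeps
the program usable on plain S/370.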
In some ways z/OS is better than MVS/380 and in
some ways it is the reverse. You are advised
to code for the lowest common denominator, and
MVS/380 is the best place to do that.
Let's first clarify where MVS/380 is superior.
z/OS does not provide any convenient way to
allocate AT4B (above the 4 GiB bar that 32-bit
addressing has as a limit) memory. It has an
IARV64 macro but the details of that are not
open source so it cannot be reproduced. Also
it is a dog to call (complex and space is
allocated as a minimum of 1 MiB). MVS/380 instead
provides LOC=64 as a parameter to conveniently
get RM64 (AT4B) memory.
Also z/OS does not provide a way for 32-bit
programs to obtain AT2B (above the 2 GiB bar
that 31-bit addressing has as a limit) memory.
Not even IARV64 will get around this problem.
The 2 GiB to 4 GiB range of memory is completely
out of bounds, which is something IBM has
copped flak for in the past. MVS/380 instead
provides a LOC=32 parameter to get RM32 (AT2B)
memory.
The RM64 deficiency of z/OS could be overcome
by having a 3rd party intercept for SVC 120
that would issue an IARV64 call. But the RM32
deficiency requires z/OS to be updated.
The good news is that you can code LOC=32, and
it switches on the high bit of the option byte,
while using the LOC=31 flags for the other 7
bits, meaning that you can write and run a
program on MVS/380 that uses the full 4 GiB
address space, and the program will still run
on z/OS, just in degraded mode, only being
able to allocate a maximum of 2 GiB of memory.
The LOC=64 parameter also switches on the top
bit of the option byte, while leaving the
other bits at the default LOC=RES setting,
meaning it will probably obtain BTL (below
the 16 MiB line, aka RM24) memory unless you
have an SVC 120 intercept in place.
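
The option byte trick can be pictured with the following C
sketch. It assumes that a system unaware of the high bit simply
ignores it, which is modelled here as masking. The actual values
of the LOC=31 flags are not given in this document, so they are
passed in as a parameter rather than hardcoded, and both function
names are hypothetical.

```c
#include <stdint.h>

#define LOC_HIGH_BIT 0x80u  /* the high bit of the option byte */

/* Build a LOC=32 option byte: high bit on, other 7 bits taken
   from whatever the LOC=31 flags are (supplied by the caller). */
uint8_t loc32_option(uint8_t loc31_flags)
{
    return (uint8_t)(LOC_HIGH_BIT | (loc31_flags & 0x7Fu));
}

/* What a system that ignores the high bit would act on
   (assumption: the bit is ignored rather than rejected). */
uint8_t option_without_high_bit(uint8_t option)
{
    return (uint8_t)(option & 0x7Fu);
}
```

Under that assumption, a LOC=32 request degrades to exactly the
LOC=31 request on a system that does not know about the high bit.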
z/OS is superior to MVS/380 mainly in the fact
that the OS routines can handle being executed
in AM31, so a module marked RM24 on MVS/380
can probably be marked as RM31 on z/OS, freeing
up the BTL memory that would otherwise be
occupied by your load module. Since load modules
are normally not very big, and the big space
filler is data, not code, this is of marginal
benefit. It does mean that GAMOS will not need
to execute a BSM, but that is also of marginal
benefit. There are other things too, like the
assembler handles a LG Rx,=LD(XXX), but this
is also of marginal benefit because the code
generator simply needs to generate a SLGR Rx,Rx
and L Rx,=L(XXX) to achieve the same thing, ie
load a 64-bit data pointer. And note that if
you are declaring a 64-bit pointer you would
typically have a YYY DS D.
Note that when doing a GETMAIN, if the current
AMODE is 24, then the LOC= will be ignored and
you will get RM24 memory. Ditto if the current
AMODE is 31, then a LOC=32 or LOC=64 GETMAIN
request will only obtain RM31 memory. In addition,
if the current AMODE is 32/64, then all GETMAIN
requests (ie even for LOC=24) will set all 64 bits
of R15 to provide a clean 64-bit programming
environment. On z/OS all 64 bits may be set even
if you are AM24/31. GETMAIN cannot distinguish
between AM32 and AM64 so a LOC=64 request will
obtain AT4B memory even if your 32-bit program
cannot handle it. A LOC=64 request will also
use 64-bit registers for the amount of data
being requested.
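
The interaction between the current AMODE and the LOC= value
described above can be summarised as a small C function. This is
a sketch of the stated rules only, with a hypothetical name.
LOC=RES is not modelled.

```c
#include <assert.h>

/* Given the caller's current AMODE and the requested LOC= value
   (24, 31, 32 or 64), return the kind of memory (as an RMODE
   number) that the text says MVS/380 GETMAIN will hand back. */
int effective_loc(int amode, int loc)
{
    assert(amode == 24 || amode == 31 || amode == 32 || amode == 64);
    assert(loc == 24 || loc == 31 || loc == 32 || loc == 64);

    if (amode == 24)
        return 24;                    /* LOC= is ignored in AM24 */
    if (amode == 31)
        return loc > 31 ? 31 : loc;   /* LOC=32/64 capped at RM31 */
    /* AM32 and AM64 cannot be told apart, so the request is
       honoured as given, even LOC=64 for a 32-bit program. */
    return loc;
}
```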
Another thing to note is that MVS 3.8j will
ignore the extra flags in the GETMAIN option
byte that it doesn't know about, so the
recommended practice for your programs that
are using 32-bit data pointers is to use
LOC=32 and mark your load modules AM32=64.
Your programs will still work fine on MVS 3.8j.
If you fail to specify a LOC= to your GETMAIN,
it will default to LOC=RES, ie the location of
the memory will depend on whether your module
has been loaded BTL or ATL. In general this is
not a good option, and should be avoided. You
should instead specify an explicit LOC=24 for
any data that any of MVS/380, MVS/XA or z/OS
require to be kept BTL, or an explicit LOC=31
or LOC=32 for any data that can be stored ATL.
This avoids both wasteful use of resources and
abends.
Don't use the MODE=31 option for OPEN as it
doesn't work on MVS/380.
Don't assume that R15 points to the program
entry point. On MVS/380 the bottom bit may
actually be 1.
When you write a 64-bit application in any
form, ie 64-bit longs and/or 64-bit data
pointers, you should establish an F5SA save
area on program entry and from then on use F4SA.
To cater to AM32 programs that rely on address
truncation at 32 bits, the entire 4 GiB to
8 GiB virtual address space is left vacant on
z/380 so that any virtual memory address in this
range is truncated. In addition, MVS/380 GETMAIN
will not return memory located in the last two
4 KiB pages before 4 GiB nor the first 4 KiB
pages after 8 GiB. Nor will the operating system
use that space for anything else in case a
32-bit program does a STD 4095(1,2) where
registers 1 and 2 are set to x'FFFFFFFF'. Also
note that there may be an implementation where
there is a different PSW bit to distinguish
between AM32 and AM64, but for now they are
the same thing, and there is no flag in the
module to distinguish this anyway. There could
theoretically be a different way of identifying
AM32 programs in the future, such as maintaining
a list of them, but so far this has been deemed
unnecessary, and the main emphasis is that
AM64 programs should be treated as AM-infinity
and not expect truncation in the first place.
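
The arithmetic behind the vacant range can be worked through in
C. The sketch below just adds base, index and displacement, once
keeping all 64 bits and once truncating to 32 bits. The function
names are hypothetical.

```c
#include <stdint.h>

/* Effective address of D(X,B) with no truncation (AM64 view). */
uint64_t effective_address64(uint64_t base, uint64_t index,
                             uint32_t disp)
{
    return base + index + disp;
}

/* The same address truncated to 32 bits (what an AM32 program
   that relies on wraparound expects to get). */
uint32_t effective_address32(uint64_t base, uint64_t index,
                             uint32_t disp)
{
    return (uint32_t)effective_address64(base, index, disp);
}
```

For STD 4095(1,2) with both registers at x'FFFFFFFF', the
untruncated sum is x'200000FFD', which is just past 8 GiB
(x'200000000'), while the truncated result is x'00000FFD'.
Similarly, a base of x'1000' with an index of x'FFFFFFFF'
(meant as negative 1) gives x'100000FFF' untruncated, which
lands in the 4 GiB to 8 GiB range, versus x'00000FFF' truncated.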
Unrelated to application programming, if you
can do a BSM to x'01' and it works, that is
a sign you are running on a minimal S/380
machine or above. You should do a SAM64 to
see if you are running on a z/380 or above.
There is a technical restriction on MVS/380
that ATL (RM31) memory is only cleared between
steps of a batch job, not between TSO commands.
It is expected that mvs380mn will be renamed
to mvs380, written in C (calling assembler
routines), and have a "clearatl" parameter
to clear the ATL memory until someone constructs
a zap to TSO processing. MVS380MN and GETMAIN are
also expected to use DIAGs at some point so that
each address space can get exclusive use of its
own ATL memory. This is called the "separate
memory" proposal. This would only happen when
NUMPART=1.
It is hoped that one day a large stack buffer
will be added to Hercules/380 main() which
will be later used when an EXEC PGM=XXX is
executed, and XXX will actually be an 80386
module operating in EBCDIC which returns
control to Hercules (and then MVS/380), after
manipulating the stack, whenever something like
an OPEN macro needs to be called. This will
allow people using MVS/380 to operate it using
their PC with next to no performance penalty.
It will become another PC operating system.